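[ { "path": "CVPR2020.md", "content": "\r\n* Recommended reading:
\r\n * [Collection of computer vision survey papers, 2020-2021](https://github.com/extreme-assistant/survey-computer-vision)
\r\n * [Complete collection of object tracking resources, 2019-2020 (papers, model code, leading labs)](https://github.com/extreme-assistant/Object-Tracking-Paper-Benchmark-Team)\r\n\r\n\r\n------\r\n\r\n
\r\n\r\n# CVPR 2020: latest news and paper downloads (Papers/Codes/Project/PaperReading/Demos/live streams/paper-sharing sessions, etc.)\r\n\r\n\r\nOfficial website: http://cvpr2020.thecvf.com/
\r\nDate and location: June 14-19, 2020, Seattle, Washington
\r\nAcceptance results announced: February 24, 2020
\r\n\r\nRelated discussions:\r\n* [What do you make of submissions to CVPR 2020, a top computer vision conference, exceeding 10,000?](https://www.zhihu.com/question/356099725/)
\r\n* [What do you think of the CVPR 2020 acceptance results? Which papers stand out?](https://www.zhihu.com/question/372070853)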

\r\n\r\n# Contents\r\n[1. CVPR 2020 accepted papers by category (continuously updated)](#100)
\r\n[2. CVPR 2020 Oral (continuously updated)](#101)
\r\n[3. CVPR 2020 paper interpretations](#102)
\r\n[4. To do list](#103)
\r\n[5. Related links](#104)
\r\n\r\n\r\n

\r\n\r\n\r\n\r\n# 1. CVPR 2020 accepted papers (continuously updated)
\r\n\r\n### Roundups by topic
\r\n* [20. CVPR 2020 Paper Roundup: Action Detection and Action Segmentation](https://bbs.cvmart.net/topics/3028)
\r\n\r\n* [19. CVPR 2020 Paper Roundup: Action Recognition](https://bbs.cvmart.net/topics/3000)
\r\n \r\n* [18. CVPR 2020 Paper Roundup: Optical Flow](https://bbs.cvmart.net/topics/2992)
\r\n \r\n* [17. CVPR 2020 Paper Roundup: Image and Video Retrieval](https://bbs.cvmart.net/topics/2964)
\r\n\r\n* [16. CVPR 2020 Paper Roundup: Remote Sensing and Aerial Image Processing and Recognition](https://bbs.cvmart.net/topics/2953)
\r\n \r\n* [15. CVPR 2020 Paper Roundup: Image Quality Assessment](https://bbs.cvmart.net/topics/2923)
\r\n \r\n* [14. CVPR 2020 Paper Roundup: Image Inpainting](https://bbs.cvmart.net/topics/2903)
\r\n \r\n* [13. CVPR 2020 Paper Roundup: Image Enhancement and Image Restoration](https://bbs.cvmart.net/topics/2902)
\r\n \r\n* [12. CVPR 2020 Paper Roundup: Deraining, Dehazing, and Deblurring](https://bbs.cvmart.net/topics/2876)
\r\n\r\n* [11. CVPR 2020 Paper Roundup: Medical Image Processing and Recognition](https://bbs.cvmart.net/topics/2855)
\r\n\r\n* [10. CVPR 2020 Paper Roundup: Image Matting](https://bbs.cvmart.net/topics/2854)
\r\n\r\n* [9. CVPR 2020 Paper Roundup: Image Segmentation (complete edition)](https://bbs.cvmart.net/topics/2829)
\r\n\r\n* [8. CVPR 2020 Paper Roundup: Panoptic Segmentation and Video Object Segmentation](https://bbs.cvmart.net/topics/2818)
\r\n\r\n* [7. CVPR 2020 Paper Roundup: Super-Resolution](https://bbs.cvmart.net/topics/2725)
\r\n \r\n* [6. CVPR 2020 Paper Roundup: Object Detection](https://bbs.cvmart.net/topics/2732)
\r\n \r\n* [5. CVPR 2020 Paper Roundup: Face Technology](https://bbs.cvmart.net/topics/2720)
\r\n \r\n* [4. CVPR 2020 Paper Roundup: Object Tracking](https://bbs.cvmart.net/topics/2733)
\r\n \r\n* [3. CVPR 2020 Paper Roundup: Text in Images](https://bbs.cvmart.net/topics/2778)
\r\n \r\n* [2. CVPR 2020 Paper Roundup: Pedestrian Detection and Re-identification](https://bbs.cvmart.net/topics/2751)
\r\n \r\n* [1. CVPR 2020 Paper Roundup: Instance Segmentation](https://bbs.cvmart.net/topics/2806)
\r\n\r\n\r\n
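\r\n\r\n### Table of contents
\r\n[1. Object detection](#1)
\r\n[2. Face recognition](#3)
\r\n[3. Object tracking](#4)
\r\n[4. 3D point clouds / 3D reconstruction / 3D detection / 3D segmentation / depth estimation](#5)
\r\n[5. Image recognition](#18)
\r\n[6. Image processing](#6)
\r\n[7. Image classification](#7)
\r\n[8. Image segmentation](#2)
\r\n[9. Pose estimation / action recognition](#8)
\r\n[10. Video analysis](#9)
\r\n[11. OCR](#10)
\r\n[12. GAN](#11)
\r\n[13. Few-shot / zero-shot](#12)
\r\n[14. Weakly supervised / unsupervised / self-supervised](#13)
\r\n[15. Pedestrian tracking / pedestrian detection / ReID](#14)
\r\n[16. Neural networks / model acceleration / model compression](#15)
\r\n[17. Super-resolution](#16)
\r\n[18. Visual commonsense / datasets / other](#17)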
\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### Object detection\r\n\r\n1. Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
\r\nPaper: https://arxiv.org/abs/1912.02424
\r\nCode: https://github.com/sfzhang15/ATSS

\r\n\r\n2. Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector
\r\nPaper: https://arxiv.org/abs/1908.01998

\r\n\r\n3. AugFPN: Improving Multi-scale Feature Learning for Object Detection
\r\nPaper: https://arxiv.org/abs/1912.05384

\r\n\r\n4. Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection
\r\nPaper: https://arxiv.org/abs/2003.11818
\r\nCode: https://github.com/ggjy/HitDet.pytorch

\r\n\r\n5. Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
\r\nPaper: https://arxiv.org/abs/2003.08813

\r\n\r\n6. CentripetalNet: Pursuing High-quality Keypoint Pairs for Object Detection
\r\nPaper: https://arxiv.org/abs/2003.09119
\r\nCode: https://github.com/KiveeDong/CentripetalNet

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### Face recognition\r\n\r\n1. Towards Universal Representation Learning for Deep Face Recognition
\r\nPaper: https://arxiv.org/abs/2002.11841

\r\n\r\n2. Suppressing Uncertainties for Large-Scale Facial Expression Recognition
\r\nPaper: https://arxiv.org/abs/2002.10392
\r\nCode: https://github.com/kaiwang960112/Self-Cure-Network

\r\n\r\n3. Face X-ray for More General Face Forgery Detection
\r\nPaper: https://arxiv.org/pdf/1912.13458.pdf

\r\n\r\n4. Pose Agnostic Cross-spectral Hallucination via Disentangling Independent Factors
\r\nPaper: https://arxiv.org/abs/1909.04365

\r\n\r\n5. Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing
\r\nPaper: https://arxiv.org/abs/2003.08061
\r\nCode: https://github.com/clks-wzz/FAS-SGTD

\r\n\r\n6. Learning Meta Face Recognition in Unseen Domains
\r\nPaper: https://arxiv.org/abs/2003.07733
\r\nCode: https://github.com/cleardusk/MFR

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### Object tracking\r\n\r\n1. ROAM: Recurrently Optimizing Tracking Model
\r\nPaper: https://arxiv.org/abs/1907.12006

\r\n\r\n

\r\n\r\n \r\n\r\n### 3D point clouds / 3D reconstruction / 3D detection / 3D segmentation / depth estimation\r\n\r\n* 3D point clouds & reconstruction\r\n\r\n1. PF-Net: Point Fractal Network for 3D Point Cloud Completion
\r\nPaper: https://arxiv.org/abs/2003.00410

\r\n\r\n2. PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
\r\nPaper: https://arxiv.org/abs/2002.10876
\r\nCode: https://github.com/liruihui/PointAugment/

\r\n\r\n3. Learning multiview 3D point cloud registration
\r\nPaper: https://arxiv.org/abs/2001.05119

\r\n\r\n4. C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds
\r\nPaper: https://arxiv.org/abs/1912.07009

\r\n\r\n5. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
\r\nPaper: https://arxiv.org/abs/1911.11236

\r\n\r\n\r\n6. Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image
\r\nPaper: https://arxiv.org/abs/2002.12212

\r\n\r\n7. Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
\r\nPaper: https://arxiv.org/abs/2003.01456

\r\n\r\n8. In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks
\r\nPaper: https://arxiv.org/pdf/1911.11924.pdf

\r\n\r\n9. Attentive Context Normalization for Robust Permutation-Equivariant Learning
\r\nPaper: https://arxiv.org/abs/1907.02545\r\nAuthors: Weiwei Sun, Wei Jiang, Eduard Trulls, Andrea Tagliasacchi, Kwang Moo Yi

\r\n\r\n10. PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
\r\nPaper: https://arxiv.org/abs/1911.10949

\r\n\r\n11. SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
\r\nPaper: https://arxiv.org/abs/1912.00036

\r\n\r\n12. Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
\r\nPaper: https://arxiv.org/abs/1912.06378
\r\nCode: https://github.com/alibaba/cascade-stereo

\r\n\r\n13. Unsupervised Learning of Intrinsic Structural Representation Points
\r\nPaper: https://arxiv.org/abs/2003.01661
\r\nCode: https://github.com/NolenChen/3DStructurePoints

\r\n\r\n* 3D reconstruction\r\n1. Leveraging 2D Data to Learn Textured 3D Mesh Generation
\r\nPaper: https://arxiv.org/abs/2004.04180

\r\n\r\n2. ARCH: Animatable Reconstruction of Clothed Humans
\r\nPaper: https://arxiv.org/abs/2004.04572

\r\n\r\n3. Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions
\r\nPaper: https://arxiv.org/abs/2004.03967

\r\n\r\n

\r\n\r\n \r\n\r\n### Image recognition\r\n\r\n* Image feature matching\r\n1. Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task
\r\nPaper: https://arxiv.org/abs/1912.00623

\r\n\r\n2. Correspondence Networks with Adaptive Neighbourhood Consensus
\r\nPaper: https://arxiv.org/abs/2003.12059

\r\n\r\n* Image captioning\r\n1. Normalized and Geometry-Aware Self-Attention Network for Image Captioning
\r\nPaper: https://arxiv.org/abs/2003.08897

\r\n\r\n

\r\n\r\n \r\n\r\n### Image processing\r\n\r\n1. Learning to Shade Hand-drawn Sketches
\r\nPaper: https://arxiv.org/abs/2002.11812

\r\n\r\n2. Single Image Reflection Removal through Cascaded Refinement
\r\nPaper: https://arxiv.org/abs/1911.06634

\r\n\r\n3. Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data
\r\nPaper: https://arxiv.org/abs/2002.11297

\r\n\r\n4. Deep Image Harmonization via Domain Verification
\r\nPaper: https://arxiv.org/abs/1911.13239
\r\nCode: https://github.com/bcmi/Image_Harmonization_Datasets

\r\n\r\n5. RoutedFusion: Learning Real-time Depth Map Fusion
\r\nPaper: https://arxiv.org/pdf/2001.04388.pdf

\r\n\r\n6. Neural Contours: Learning to Draw Lines from 3D Shapes
\r\nPaper: https://arxiv.org/abs/2003.10333

\r\n\r\n7. Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content
\r\nPaper: https://arxiv.org/abs/2003.05863

\r\n\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### Image classification\r\n\r\n1. Self-training with Noisy Student improves ImageNet classification
\r\nPaper: https://arxiv.org/abs/1911.04252

\r\n\r\n2. Image Matching across Wide Baselines: From Paper to Practice
\r\nPaper: https://arxiv.org/abs/2003.01587

\r\n\r\n3. Towards Robust Image Classification Using Sequential Attention Models
\r\nPaper: https://arxiv.org/abs/1912.02184

\r\n\r\n4. Learning in the Frequency Domain
\r\nPaper: https://arxiv.org/abs/2002.12416

\r\n\r\n5. Learning from Web Data with Memory Module
\r\nPaper: https://arxiv.org/abs/1906.12028

\r\n\r\n6. Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks
\r\nPaper: https://arxiv.org/abs/1912.09393

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### Image segmentation\r\n\r\n1. Semi-Supervised Semantic Image Segmentation with Self-correcting Networks
\r\nPaper: https://arxiv.org/abs/1811.07073

\r\n\r\n2. Deep Snake for Real-Time Instance Segmentation
\r\nPaper: https://arxiv.org/abs/2001.01629

\r\n\r\n3. CenterMask: Real-Time Anchor-Free Instance Segmentation
\r\nPaper: https://arxiv.org/abs/1911.06667
\r\nCode: https://github.com/youngwanLEE/CenterMask

\r\n\r\n4. SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks
\r\nPaper: https://arxiv.org/abs/2003.00678

\r\n\r\n5. PolarMask: Single Shot Instance Segmentation with Polar Representation
\r\nPaper: https://arxiv.org/abs/1909.13226
\r\nCode: https://github.com/xieenze/PolarMask

\r\n\r\n\r\n6. xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
\r\nPaper: https://arxiv.org/abs/1911.12676

\r\n\r\n7. BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
\r\nPaper: https://arxiv.org/abs/2001.00309

\r\n\r\n8. Enhancing Generic Segmentation with Learned Region Representations
\r\nPaper: https://arxiv.org/abs/1911.08564

\r\n\r\n

\r\n\r\n \r\n\r\n### Pose estimation / action recognition\r\n\r\n1. VIBE: Video Inference for Human Body Pose and Shape Estimation
\r\nPaper: https://arxiv.org/abs/1912.05656
\r\nCode: https://github.com/mkocabas/VIBE

\r\n\r\n2. Distribution-Aware Coordinate Representation for Human Pose Estimation
\r\nPaper: https://arxiv.org/abs/1910.06278
\r\nCode: https://github.com/ilovepose/DarkPose

\r\n\r\n3. 4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras
\r\nPaper: https://arxiv.org/abs/2002.12625

\r\n\r\n4. Optimal least-squares solution to the hand-eye calibration problem
\r\nPaper: https://arxiv.org/abs/2002.10838

\r\n\r\n5. D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
\r\nPaper: https://arxiv.org/abs/2003.01060

\r\n\r\n6. Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
\r\nPaper: https://arxiv.org/abs/2001.09691

\r\n\r\n7. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
\r\nPaper: https://arxiv.org/abs/1911.07524

\r\n\r\n8. PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
\r\nPaper: https://arxiv.org/abs/1911.04231

\r\n\r\n9. Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
\r\nPaper: https://arxiv.org/abs/2003.02824

\r\n\r\n10. G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
\r\nPaper: https://arxiv.org/abs/2003.11089

\r\n\r\n11. Deep Image Spatial Transformation for Person Image Generation
\r\nPaper: https://arxiv.org/abs/2003.00696
\r\nCode: https://github.com/RenYurui/Global-Flow-Local-Attention

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### Video analysis\r\n\r\n1. Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
\r\nPaper: https://arxiv.org/abs/2003.01455
\r\nCode: https://github.com/bbrattoli/ZeroShotVideoClassification

\r\n\r\n2. Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
\r\nPaper: https://arxiv.org/abs/2003.00387

\r\n\r\n3. Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
\r\nPaper: https://arxiv.org/abs/2003.00392

\r\n\r\n4. Object Relational Graph with Teacher-Recommended Learning for Video Captioning
\r\nPaper: https://arxiv.org/abs/2002.11566

\r\n\r\n5. Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
\r\nPaper: https://arxiv.org/abs/2002.11616

\r\n\r\n6. Blurry Video Frame Interpolation
\r\nPaper: https://arxiv.org/abs/2002.12259

\r\n\r\n7. Hierarchical Conditional Relation Networks for Video Question Answering
\r\nPaper: https://arxiv.org/abs/2002.10698

\r\n\r\n8. Action Modifiers: Learning from Adverbs in Instructional Video
\r\nPaper: https://arxiv.org/abs/1912.06617

\r\n\r\n9. Visual Grounding in Video for Unsupervised Word Translation
\r\nPaper: https://arxiv.org/abs/2003.05078
\r\nCode: https://github.com/gsig/visual-grounding

\r\n\r\n10. MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask (video analysis: optical flow estimation)
\r\nPaper: https://arxiv.org/abs/2003.10955
\r\nCode: https://github.com/microsoft/MaskFlownet

\r\n\r\n11. Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects (video prediction)
\r\nPaper: https://arxiv.org/abs/2003.12045
\r\nCode: https://ehsanik.github.io/forcecvpr2020

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### OCR\r\n\r\n1. ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
\r\nPaper: https://arxiv.org/abs/2002.10200
\r\nCode: https://github.com/Yuliang-Liu/bezier_curve_text_spotting, https://github.com/aim-uofa/adet

\r\n\r\n2. Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
\r\nPaper: https://arxiv.org/abs/1911.06258

\r\n\r\n

\r\n\r\n \r\n\r\n### GAN\r\n\r\n1. Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models
\r\nPaper: https://arxiv.org/abs/1911.12287
\r\nCode: https://github.com/giannisdaras/ylg

\r\n\r\n2. MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis
\r\nPaper: https://arxiv.org/abs/1903.06048

\r\n\r\n3. Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory
\r\nPaper: https://arxiv.org/abs/1911.04636

\r\n\r\n4. PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
\r\nPaper: https://arxiv.org/abs/1909.06956

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### Few-shot / zero-shot\r\n\r\n1. Improved Few-Shot Visual Classification
\r\nPaper: https://arxiv.org/pdf/1912.03432.pdf

\r\n\r\n2. Meta-Transfer Learning for Zero-Shot Super-Resolution
\r\nPaper: https://arxiv.org/abs/2002.12213

\r\n\r\n3. Instance Credibility Inference for Few-Shot Learning
\r\nPaper: https://arxiv.org/abs/2003.11853
\r\nCode: https://github.com/Yikai-Wang/ICI-FSL

\r\n\r\n

\r\n\r\n \r\n\r\n### Weakly supervised / unsupervised / self-supervised\r\n\r\n1. Rethinking the Route Towards Weakly Supervised Object Localization
\r\nPaper: https://arxiv.org/abs/2002.11359

\r\n\r\n2. NestedVAE: Isolating Common Factors via Weak Supervision
\r\nPaper: https://arxiv.org/abs/2002.11576

\r\n\r\n3. Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
\r\nPaper: https://arxiv.org/abs/1911.07450

\r\n\r\n4. Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
\r\nPaper: https://arxiv.org/abs/2003.01460

\r\n\r\n5. ClusterFit: Improving Generalization of Visual Representations
\r\nPaper: https://arxiv.org/abs/1912.03330

\r\n\r\n6. Auto-Encoding Twin-Bottleneck Hashing
\r\nPaper: https://arxiv.org/abs/2002.11930

\r\n\r\n7. Learning Representations by Predicting Bags of Visual Words
\r\nPaper: https://arxiv.org/abs/2002.12247

\r\n\r\n8. A Characteristic Function Approach to Deep Implicit Generative Modeling
\r\nPaper: https://arxiv.org/abs/1909.07425

\r\n\r\n9. Unsupervised Learning of Intrinsic Structural Representation Points
\r\nPaper: https://arxiv.org/abs/2003.01661
\r\nCode: https://github.com/NolenChen/3DStructurePoints

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n### Pedestrian tracking / pedestrian detection / ReID\r\n1. Cross-modality Person re-identification with Shared-Specific Feature Transfer
\r\nPaper: https://arxiv.org/abs/2002.12489

\r\n\r\n2. Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
\r\nPaper: https://arxiv.org/abs/2002.11927

\r\n\r\n3. The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction
\r\nPaper: https://arxiv.org/abs/1912.06445

\r\n\r\n

\r\n\r\n \r\n\r\n### Neural networks / model compression / model acceleration\r\n\r\n1. GhostNet: More Features from Cheap Operations
\r\nPaper: https://arxiv.org/abs/1911.11907
\r\nCode: https://github.com/iamhankai/ghostnet

\r\n\r\n2. Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions
\r\nPaper: https://arxiv.org/abs/2003.01826

\r\n\r\n3. GPU-Accelerated Mobile Multi-view Style Transfer
\r\nPaper: https://arxiv.org/abs/2003.00706

\r\n\r\n4. Bundle Adjustment on a Graph Processor
\r\nPaper: https://arxiv.org/abs/2003.03134
\r\nCode: https://github.com/joeaortiz/gbp

\r\n\r\n5. Holistically-Attracted Wireframe Parsing
\r\nPaper: https://arxiv.org/abs/2003.01663

\r\n\r\n6. AdderNet: Do We Really Need Multiplications in Deep Learning?
\r\nPaper: https://arxiv.org/abs/1912.13200

\r\n\r\n7. CARS: Continuous Evolution for Efficient Neural Architecture Search
\r\nPaper: https://arxiv.org/abs/1909.04977
\r\nCode: https://github.com/huawei-noah/CARS

\r\n\r\n8. Π-nets: Deep Polynomial Neural Networks
\r\nPaper: https://arxiv.org/abs/2003.03828

\r\n\r\n9. Explaining Knowledge Distillation by Quantifying the Knowledge
\r\nPaper: https://arxiv.org/abs/2003.03622

\r\n\r\n

\r\n\r\n \r\n\r\n### Super-resolution\r\n\r\n1. Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
\r\nPaper: https://arxiv.org/abs/2002.11616

\r\n\r\n2. Closed-loop Matters: Dual Regression Networks for Single Image Super-Resolution
\r\nPaper: https://arxiv.org/abs/2003.07018
\r\nCode: https://github.com/guoyongcs/DRN

\r\n\r\n

\r\n\r\n \r\n\r\n### Visual commonsense / other\r\n\r\n1. Visual Commonsense R-CNN
\r\nPaper: https://arxiv.org/abs/2002.12204
\r\nCode: https://github.com/Wangt-CN/VC-R-CNN

\r\n\r\n2. Scalable Uncertainty for Computer Vision with Functional Variational Inference
\r\nPaper: https://arxiv.org/abs/2003.03396

\r\n\r\n3. Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective
\r\nPaper: https://arxiv.org/abs/2002.10826

\r\n\r\n4. Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs
\r\nPaper: https://arxiv.org/abs/2003.00287

\r\n\r\n5. Filter Grafting for Deep Neural Networks
\r\nPaper: https://arxiv.org/abs/2001.05868
\r\nCode: https://github.com/fxmeng/filter-grafting.git

\r\n\r\n6. 12-in-1: Multi-Task Vision and Language Representation Learning
\r\nPaper: https://arxiv.org/abs/1912.02315

\r\n\r\n7. Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
\r\nPaper: https://arxiv.org/abs/2002.10638
\r\nCode: https://github.com/weituo12321/PREVALENT

\r\n\r\n8. Unbiased Scene Graph Generation from Biased Training
\r\nPaper: https://arxiv.org/abs/2002.11949

\r\n\r\n9. Towards Visually Explaining Variational Autoencoders
\r\nPaper: https://arxiv.org/abs/1911.07389

\r\n\r\n10. BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
\r\nPaper: http://www.weixiushen.com/publication/cvpr20_BBN.pdf
\r\nCode: https://github.com/Megvii-Nanjing/BBN

\r\n\r\n11. High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks
\r\nPaper: https://arxiv.org/abs/1905.13545

\r\n\r\n12. SAM: The Sensitivity of Attribution Methods to Hyperparameters
\r\nPaper: http://s.anhnguyen.me/sam\\_cvpr2020.pdf
\r\nCode: https://github.com/anguyen8/sam

\r\n\r\n13. Π-nets: Deep Polynomial Neural Networks
\r\nPaper: https://arxiv.org/abs/2003.03828

\r\n\r\n14. Towards Backward-Compatible Representation Learning
\r\nPaper: https://arxiv.org/abs/2003.11942

\r\n\r\n15. On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
\r\nPaper: https://arxiv.org/abs/2003.07064

\r\n\r\n16. KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations (dataset)
\r\nPaper: https://arxiv.org/abs/2002.12687

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n# 2. CVPR 2020 Oral (continuously updated)
\r\n[1. PolarMask: Single Shot Instance Segmentation with Polar Representation](https://arxiv.org/abs/1909.13226)
\r\nCode: https://github.com/xieenze/PolarMask

\r\n\r\n[2. Unbiased Scene Graph Generation from Biased Training](https://arxiv.org/abs/2002.11949)
\r\nCode: https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch

\r\n\r\n[3. Learning to Shade Hand-drawn Sketches](https://arxiv.org/abs/2002.11812)
\r\nCode: https://github.com/qyzdao/ShadeSketch

\r\n\r\n[4. SAM: The Sensitivity of Attribution Methods to Hyperparameters](http://s.anhnguyen.me/sam_cvpr2020.pdf)
\r\nCode: https://github.com/anguyen8/sam

\r\n\r\n[5. High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks](https://arxiv.org/abs/1905.13545)

\r\n\r\n[6. Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task](https://arxiv.org/abs/1912.00623)

\r\n\r\n[7. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds](https://arxiv.org/abs/1911.11236)

\r\n\r\n[8. AdderNet: Do We Really Need Multiplications in Deep Learning?](https://arxiv.org/abs/1912.13200)

\r\n\r\n[9. Multi-Modal Domain Adaptation for Fine-Grained Action Recognition](https://arxiv.org/abs/2001.09691)

\r\n\r\n[10. Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation](https://arxiv.org/abs/2003.08813)

\r\n\r\n[11. Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing](https://arxiv.org/abs/2003.08061)
\r\nCode: https://github.com/clks-wzz/FAS-SGTD

\r\n\r\n[12. Learning Meta Face Recognition in Unseen Domains](https://arxiv.org/abs/2003.07733)
\r\nCode: https://github.com/cleardusk/MFR

\r\n\r\n[13. Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching](https://arxiv.org/abs/1912.06378)
\r\nCode: https://github.com/alibaba/cascade-stereo

\r\n\r\n[14. BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition](http://www.weixiushen.com/publication/cvpr20_BBN.pdf)
\r\nCode: https://github.com/Megvii-Nanjing/BBN

\r\n\r\n[15. Towards Backward-Compatible Representation Learning](https://arxiv.org/abs/2003.11942)

\r\n\r\n[16. MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask](https://arxiv.org/abs/2003.10955)
\r\nCode: https://github.com/microsoft/MaskFlownet

\r\n\r\n[17. Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects](https://arxiv.org/abs/2003.12045)
\r\nCode: https://ehsanik.github.io/forcecvpr2020

\r\n\r\n[18. StyleRig: Rigging StyleGAN for 3D Control over Portrait Images](https://arxiv.org/abs/2004.00121)

\r\n\r\n[19. Conditional Channel Gated Networks for Task-Aware Continual Learning](https://arxiv.org/abs/2004.00070)\r\n\r\n[20. BANet: Bidirectional Aggregation Network with Occlusion Handling for Panoptic Segmentation](https://arxiv.org/abs/2003.14031)

\r\n\r\n[21. TITAN: Future Forecast using Action Priors](https://arxiv.org/abs/2003.13886)

\r\n\r\n[22. Learning Interactions and Relationships between Movie Characters](https://arxiv.org/abs/2003.13158)

\r\n\r\n[23. GPS-Net: Graph Property Sensing Network for Scene Graph Generation](https://arxiv.org/abs/2003.12962)
\r\nCode: https://github.com/taksau/GPS-Net

\r\n\r\n[24. A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising](https://arxiv.org/abs/2003.12751)
\r\nCode: https://github.com/Vandermode/NoiseModel

\r\n\r\n[25. Controllable Person Image Synthesis with Attribute-Decomposed GAN](https://arxiv.org/abs/2003.12267)
\r\nProject page: https://menyifang.github.io/projects/ADGAN/ADGAN.html

\r\n\r\n[26. Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations](https://arxiv.org/abs/2003.12237)

\r\n\r\n[27. Learning to Optimize Non-Rigid Tracking](https://arxiv.org/abs/2003.12230)

\r\n\r\n[28. Self-Supervised Scene De-occlusion](https://arxiv.org/abs/2004.02788)
\r\nProject page: https://xiaohangzhan.github.io/projects/deocclusion/

\r\n\r\n[29. Robust 3D Self-portraits in Seconds](https://arxiv.org/abs/2004.02460)

\r\n\r\n[30. Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics](https://arxiv.org/abs/2004.02331)

\r\n\r\n[31. Light Field Spatial Super-resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization](https://arxiv.org/abs/2004.02215)

\r\n\r\n[32. Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval](https://arxiv.org/abs/2004.01804)

\r\n\r\n[33. Deep White-Balance Editing](https://arxiv.org/abs/2004.01354)

\r\n\r\n[34. Tracking by Instance Detection: A Meta-Learning Approach](https://arxiv.org/abs/2004.00830)

\r\n\r\n\r\n\r\n

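\r\n\r\n \r\n\r\n# 3. CVPR 2020 paper interpretations

\r\n\r\n\r\n### [15. Unsupervised visual commonsense feature learning: an exploration of causality](https://zhuanlan.zhihu.com/p/111306353)
\r\nMore and more researchers are looking at how to apply causality from statistics to deep learning in order to improve robustness, interpretability, and so on. Most of this work, however, does not go deep into causal theory and mainly borrows a few of its concepts (such as counterfactuals). This paper aims to go one step further on that basis.
\r\nPaper: https://arxiv.org/abs/2002.12204
\r\nCode: https://github.com/Wangt-CN/VC-R-CNN

\r\n\r\n### [14. CVPR 2020 | The newest and most complete open-source framework for scene graph generation (SGG), integrating the most comprehensive set of metrics to date (open-sourced)](https://mp.weixin.qq.com/s/Nj6GjpRG8qG1ihhcoY9SwQ)
\r\nThe authors chose facebookresearch/maskrcnn-benchmark, a popular framework in 2019, as the foundation and built Scene-Graph-Benchmark.pytorch on top of it. The code is compatible with every detector model that maskrcnn-benchmark supports and, building on the quality of the facebookresearch codebase, makes the SGG part considerably more readable and easier to work with.
\r\nPaper: https://arxiv.org/abs/2002.11949
\r\nCode: https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch

\r\n\r\n\r\n### [13. CVPR 2020 | Megvii Research proposes a monocular 6DoF pose estimation algorithm based on a 3D keypoint voting network (open-sourced)](https://mp.weixin.qq.com/s/c8rQYj5lSOtI1iza9e0Dpw)
\r\nPaper: https://arxiv.org/abs/1911.04231
\r\nCode: https://github.com/ethnhe/PVN3D.git
\r\nMegvii Research proposes PVN3D, a 3D keypoint detection neural network based on Hough voting, which learns point-wise offsets to 3D keypoints and votes for those keypoints. It extends 2D-keypoint-based methods to 3D keypoints so as to make full use of the geometric constraints of rigid bodies, which greatly improves the accuracy of 6DoF estimation. In evaluations on two public datasets, YCB-Video and LineMOD, the method achieves state-of-the-art performance by a large margin.\r\n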
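\r\n\r\nThe voting step described above can be sketched roughly as follows. This is a minimal illustration, not the released PVN3D code: it assumes each point predicts an offset to one keypoint and that the votes are aggregated with a plain mean, and the name `vote_keypoint` and the toy data are hypothetical.

```python
# Minimal sketch of point-wise voting for one 3D keypoint.
# Assumption: every point predicts an offset to the keypoint, and the votes
# are averaged. vote_keypoint and the toy data below are illustrative only.

def vote_keypoint(points, offsets):
    # Each vote is the point position plus its predicted offset.
    votes = [tuple(p[i] + o[i] for i in range(3)) for p, o in zip(points, offsets)]
    n = len(votes)
    # Aggregate the votes into a single keypoint estimate (plain mean).
    return tuple(sum(v[i] for v in votes) / n for i in range(3))

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
# Hypothetical predicted offsets that all point at (1, 1, 1).
offsets = [(1.0, 1.0, 1.0), (0.0, 1.0, 1.0), (1.0, -1.0, 1.0)]
keypoint = vote_keypoint(points, offsets)
```

With noisy offsets, a robust aggregation such as clustering would likely replace the mean; the mean is used here only to keep the sketch short.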

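\r\n\r\n\r\n### [12. Cross-modality person re-identification: cm-SSFT, a shared-specific feature transfer algorithm](https://mp.weixin.qq.com/s/qPc71o2JeMDpDgRxDtp2BA)
\r\nPaper: https://arxiv.org/abs/2002.12489
\r\nThis work focuses on infrared-RGB cross-modality person re-identification. The problem it targets: most earlier cross-modality ReID algorithms concentrate on shared feature learning and pay little attention to modality-specific features, because a specific feature does not exist in the other modality. Infrared images carry no color information, for example, and color images carry no thermal information. Yet anyone who has worked on ReID knows that conventional ReID performs so well in large part because it 'overfits' somewhat to this specific information; clothing color, for instance, has always been an important cue. Starting from this observation, the paper tries to exploit specific features. The main idea is to use neighbor information. Given an infrared query, when searching for color targets, first find some easy, high-confidence color samples (which are very likely positives for the infrared query) and transfer their color-specific features to the infrared query. The infrared query can then use this color information to search for harder color samples.

\r\n\r\n### [11. RandLA-Net: a new framework for semantic segmentation of large-scale 3D point clouds (open-sourced)](https://mp.weixin.qq.com/s/xuLJ8m_ipGVBXVduA7Y0IA)
\r\nPaper: https://arxiv.org/abs/1911.11236
\r\nCode: https://github.com/QingyongHu/RandLA-Net
\r\nThe paper proposes RandLA-Net, a network architecture built on simple, efficient random downsampling and local feature aggregation. The method achieves very good results on large-scale point cloud segmentation datasets such as Semantic3D and SemanticKITTI, and it is also highly efficient (e.g., nearly 200 times faster than the graph-based method SPG).

\r\n\r\n\r\n### [10. Tencent introduces a powerful few-shot object detection algorithm and releases FSOD, a 1000-category few-shot detection training set](https://mp.weixin.qq.com/s/TRRsBGzMir0ttzjTdXwJCw)
\r\nPaper: https://arxiv.org/abs/1908.01998
\r\nThe paper proposes a new few-shot object detection algorithm whose contributions include Attention-RPN, a multi-relation detector, and a contrastive training strategy. It also builds FSOD, a few-shot detection dataset with 1000 categories; a model trained on FSOD can be transferred directly to detecting novel categories without fine-tuning.

\r\n\r\n### [9. CARS: Huawei proposes neural architecture search based on evolutionary algorithms and weight sharing, needing only half a day on a single GPU for CIFAR-10](https://mp.weixin.qq.com/s/GAL-hbERLp6vS2zB_I9jxg)\r\n
\r\nPaper: https://arxiv.org/abs/1909.04977
\r\nTo address the long time that evolutionary algorithms spend training candidate networks during neural architecture search, the paper draws on ENAS and NSGA-III and proposes continuous evolution architecture search (CARS), which makes maximal use of previously learned knowledge, such as the architectures and parameters from the previous round of evolution. It first builds a supernet for parameter sharing, generates subnets from the supernet, and then uses a non-dominated sorting strategy to select strong networks of different sizes. The whole search takes only 0.5 GPU days.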
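\r\n\r\nThe non-dominated selection mentioned above can be illustrated with a plain Pareto filter. This is a minimal sketch and not the paper's full procedure: it assumes each candidate network is scored by an (error, size) pair with both objectives minimized, and the helper names and numbers are hypothetical.

```python
# Minimal sketch of non-dominated (Pareto) selection over candidate networks.
# Assumption: each candidate is scored by (error, size) and both are minimized.

def dominates(a, b):
    # a dominates b if it is no worse on every objective and better on one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(candidates):
    # Keep the candidates that no other candidate dominates.
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

# Hypothetical (error rate, parameter count in millions) pairs.
nets = [(0.05, 4.0), (0.04, 6.0), (0.06, 5.0), (0.03, 9.0), (0.05, 3.0)]
front = non_dominated(nets)
```

The surviving candidates trade accuracy against size, which is why such a front naturally contains networks of different sizes.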

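\r\n\r\n### [8. Simplifying the problem: a new SOTA in weakly supervised object localization, pseudo supervised object localization (PSOL)](https://mp.weixin.qq.com/s/6G7BG8DrKZ0Zvi-BUqg78w)
\r\n\r\nPaper: https://arxiv.org/abs/2002.11359
\r\nThe paper proposes pseudo supervised object localization (PSOL) to address the problems of current weakly supervised object localization methods. The method splits localization and classification into two independent networks, then uses Deep Descriptor Transformation (DDT) on the training set to generate pseudo ground-truth boxes for training, and reaches SOTA overall. The paper makes three main contributions: (1) weakly supervised object localization should be divided into two independent parts, class-agnostic object localization and object classification, which leads to the PSOL algorithm; (2) although the generated bounding boxes are biased, the paper argues that they should still be optimized directly, without class labels, and doing so reaches SOTA; (3) across different datasets, PSOL transfers its localization ability well even without fine-tuning.

\r\n\r\n\r\n### [7. ByteDance: anatomy-aware 3D human pose estimation in videos](https://mp.weixin.qq.com/s/ut8CmEZPc3NMDdlgXfUzGg)
\r\n\r\nPaper: https://arxiv.org/pdf/2002.10322.pdf
\r\nIn this work, we propose a new solution for 3D human pose estimation in videos. Instead of directly regressing the 3D joint locations, we draw inspiration from human skeletal anatomy and decompose the task into bone direction prediction and bone length prediction, from which the 3D joint locations can be fully derived. Our motivation is that human bone lengths remain consistent over time. This drives us to develop effective techniques that use global information from all frames of a video for high-accuracy bone length prediction. In addition, for the bone direction prediction network, we propose a fully convolutional propagating architecture with long skip connections. Essentially, it predicts the directions of different bones hierarchically without using any time-consuming memory units (e.g., LSTM). A novel joint shift loss is further introduced to bridge the training of the bone length and bone direction prediction networks. Finally, we employ an implicit attention mechanism to feed 2D keypoint visibility scores into the model as extra guidance, which significantly mitigates depth ambiguity in many challenging poses. Our full model outperforms the previous best results on the Human3.6M and MPI-INF-3DHP datasets, and comprehensive evaluation on these datasets validates its effectiveness.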
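\r\n\r\nThe claim that joint locations follow from bone directions and bone lengths can be made concrete with a short sketch. It assumes the skeleton is a tree given as parent indices with every parent listed before its children; the function name `joints_from_bones` and the 3-joint chain are hypothetical and not taken from the paper.

```python
# Minimal sketch: recover 3D joint positions from bone directions and lengths.
# Assumption: the skeleton is a tree given as parent indices, root first, and
# every parent index is smaller than the index of its children.
import math

def joints_from_bones(parents, directions, lengths, root=(0.0, 0.0, 0.0)):
    joints = [root]
    for j in range(1, len(parents)):
        d = directions[j]
        norm = math.sqrt(sum(c * c for c in d))
        p = joints[parents[j]]
        # Child joint = parent joint + bone length * unit bone direction.
        joints.append(tuple(p[i] + lengths[j] * d[i] / norm for i in range(3)))
    return joints

# Hypothetical 3-joint chain: root -> joint 1 -> joint 2.
parents = [-1, 0, 1]
directions = [None, (0.0, 0.0, 2.0), (3.0, 0.0, 0.0)]
lengths = [None, 0.5, 0.25]
joints = joints_from_bones(parents, directions, lengths)
```

Because the lengths enter only as scalars per bone, a single length estimate can be shared across all frames of a video while the directions vary frame by frame.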

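\r\n\r\n\r\n### [6. Microsoft Research Asia: an X-ray for Deepfake faces, a new model that exposes face-swapped images](https://mp.weixin.qq.com/s/DLxqGFm6IRffPa8A0XBc4w)
\r\n\r\nPaper: https://arxiv.org/pdf/1912.13458.pdf
\r\nMicrosoft Research Asia proposes a method that can 'X-ray' an image, determining whether a face has been swapped and pointing out the boundary of the swap, without needing to see face-swapped image data or to know the face-swapping algorithm.\r\nThe new model, Face X-Ray, has two key properties: it generalizes to unseen face-swapping algorithms, and it provides an interpretable swap boundary. The key to these properties lies in the general pipeline of face-swapping algorithms. Most face-swapping algorithms can be divided into three stages: detection, modification, and blending. Unlike previous work, Face X-Ray aims to detect the artifacts produced in the third stage.

\r\n\r\n### [5. UDP: unbiased data processing for human pose estimation](https://mp.weixin.qq.com/s/J1Y0tSIpfTOZ4J-9PPyhag)
\r\n\r\nPaper: https://arxiv.org/abs/1911.07524
\r\nUDP addresses the large statistical error in the standard encoding-decoding methods used by existing SOTA human pose estimation algorithms. It also fixes the misalignment of results caused by flip testing. The method is plug-and-play and effectively improves performance with almost no increase in model complexity.

\r\n\r\n### [4. Making composite images more realistic: Shanghai Jiao Tong University proposes image harmonization based on domain verification](https://mp.weixin.qq.com/s/JgQ7bgc_bfgWE-PmJMKtOA)
\r\n\r\nPaper: https://arxiv.org/abs/1911.13239
\r\nIn a composite image, the foreground and background were captured under different conditions (such as time of day, season, lighting, and weather), so they are visibly mismatched in brightness, color tone, and similar properties. Image harmonization aims to adjust the foreground of a composite image so that it is harmonious with the background. Traditional image harmonization methods generally transfer color information from the background or from other images onto the foreground, but this cannot guarantee that the adjusted foreground looks realistic and harmonious with the background. In recent years, a small number of works have tried deep learning for image harmonization, but pairs of composite and real images are extremely hard to obtain. Without such pairs, training lacks sufficiently strong supervision, and the harmonized results have no ground truth for evaluation. Since no large-scale public image harmonization dataset existed, we **built and released an image harmonization dataset consisting of four sub-datasets. We also proposed the concept of domain verification and experimented with an image harmonization algorithm based on it.**

\r\n\r\n### [3. PolarMask: a new approach to one-stage instance segmentation](https://zhuanlan.zhihu.com/p/84890413)
\r\n\r\nPaper: https://arxiv.org/abs/1909.13226
\r\nPolarMask builds on FCOS and brings instance segmentation into the FCN framework. FCOS is essentially an FCN-style dense prediction detection framework whose performance is on par with anchor-based object detectors, which showed the field the potential of anchor-free methods. The next problem to solve is instance segmentation. The main contribution of this work is to turn the more complex instance segmentation problem into a task that is as complex as object detection in network design and computational cost, making the modeling of instance segmentation simple and efficient.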
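\r\n\r\nThe polar representation named in the paper title can be sketched as a center point plus ray lengths sampled at evenly spaced angles, which is what lets a mask be regressed much like a box. The sketch below is an illustration under that assumption; the function name `polar_to_contour` and the 4-ray example are hypothetical and far coarser than anything a real model would use.

```python
# Minimal sketch of a polar mask representation: a center plus ray lengths
# sampled at evenly spaced angles. The 4-ray example is illustrative only.
import math

def polar_to_contour(center, ray_lengths):
    n = len(ray_lengths)
    cx, cy = center
    contour = []
    for k, r in enumerate(ray_lengths):
        theta = 2.0 * math.pi * k / n  # angle of the k-th ray
        contour.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return contour

contour = polar_to_contour((10.0, 20.0), [2.0, 1.0, 2.0, 1.0])
```

Connecting the returned points in order gives the instance contour, so predicting a mask reduces to predicting one center and a fixed-length vector of distances.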

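\r\n\r\n### [2. Huawei's GhostNet surpasses Google's MobileNet (open-sourced)](https://mp.weixin.qq.com/s/Wg_BQpo_3K_fumeelDvUxA)
\r\n\r\nPaper: https://arxiv.org/abs/1911.11907
\r\nThe paper introduces a new Ghost module designed to generate more feature maps through cheap operations. Starting from a set of intrinsic feature maps, the authors apply a series of linear transformations to generate, at low cost, many 'ghost' feature maps that reveal the information underlying the intrinsic features. The Ghost module is plug-and-play; stacking Ghost modules yields the Ghost bottleneck, from which the lightweight network GhostNet is built. On ImageNet classification, GhostNet reaches 75.7% top-1 accuracy at a similar computational cost, higher than the 75.2% of MobileNetV3.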
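\r\n\r\nWhy the cheap operations save cost can be seen from a rough weight count. The sketch below compares an ordinary convolution with a Ghost-style module under stated assumptions: `s` is the ratio of output maps to intrinsic maps, each cheap operation is a single `d x d` depthwise filter, and biases are ignored. The function names and layer sizes are hypothetical.

```python
# Minimal sketch: weight counts of an ordinary convolution vs. a Ghost module.
# Assumptions: s is the ratio of output maps to intrinsic maps, and each cheap
# operation is a d x d depthwise convolution. Biases are ignored.

def conv_params(c_in, c_out, k):
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, s=2, d=3):
    intrinsic = c_out // s                      # maps from the ordinary conv
    primary = conv_params(c_in, intrinsic, k)   # cost of the intrinsic maps
    cheap = (c_out - intrinsic) * d * d         # one d x d filter per ghost map
    return primary + cheap

ordinary = conv_params(64, 128, 3)
ghost = ghost_params(64, 128, 3)
```

For this hypothetical layer the Ghost-style count is close to half of the ordinary one, in line with the roughly `s`-fold saving the module is designed for.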

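\r\n\r\n### [1. Devi Parikh (Georgia Tech): multi-task vision and language representation learning](https://mp.weixin.qq.com/s/8CvUT9JvnysIXay7vyY16w)
\r\n\r\nPaper: https://arxiv.org/abs/1912.02315
\r\nMuch vision-and-language research focuses on a small but diverse set of independent tasks and their supporting datasets, which are often studied in isolation; however, the visually grounded language understanding skills needed to succeed at these tasks overlap significantly. In this work, we investigate the relationships between vision-and-language tasks by developing a large-scale, multi-task training regime.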

\r\n\r\n\r\n

\r\n\r\n \r\n\r\n# 4. To do list
\r\n* Keep the reproduced code for CVPR 2020 papers up to date
\r\n* Follow up on CVPR 2020 paper-sharing sessions
\r\n\r\n

\r\n\r\n \r\n\r\n# 5. Related links
\r\n* [CVPR 2019/2018/2017: complete resource downloads (papers, code, etc.)](https://github.com/extreme-assistant/cvpr2020/blob/master/README.md)
\r\n* https://github.com/extreme-assistant/iccv2019

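\r\n\r\n\r\n# 6.CVPR2020 contributors WeChat Group
\r\nTo make it easier for everyone to communicate, Extreme Mart (极市) has set up a contributors group and an authors WeChat group. To join, add the assistant on WeChat (cv-mart) with the note CVPR2020.\r\n" }, { "path": "CVPR2021.md", "content": "* Recommended reading: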
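\n * [ICCV 2021/2019/2017 papers, code, commentary, and livestream collection](https://github.com/extreme-assistant/ICCV2021-Paper-Code-Interpretation)\n * [Survey papers in computer vision, 2020-2021](https://github.com/extreme-assistant/survey-computer-vision)\n * [Leading computer vision teams in China and abroad](https://github.com/extreme-assistant/Awesome-CV-Team)\n------\n\n# CVPR 2021: Latest News and Paper Downloads (Papers/Codes/Project/PaperReading/Demos/Livestreams/Paper-Sharing Sessions)\n\nOfficial website: http://cvpr2021.thecvf.com
\nDates: June 19-25, 2021
\nAcceptance notification: February 28, 2021
\n\nRelated discussions:
\n\n* [What do you make of the CVPR 2021 paper acceptance results?](https://www.zhihu.com/question/446299297)
\n* [CVPR 2021 acceptance results are out: 1,663 papers accepted and a higher acceptance rate. Did yours get in? (with paper downloads)](https://mp.weixin.qq.com/s/4UQ2W1V-eLnL02L8BDOtMg)\n\n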

\n\n# Contents\n\n[1. CVPR 2021 accepted papers and code by topic (being updated)](#1)
\n[2. CVPR 2021 Oral (being updated)](#2)
\n[3. CVPR 2021 paper commentary roundup (being updated)](#3)
\n[4. CVPR2021 Workshop](#4)
\n[5. To do list](#4)
\n\n\n
\n\n \n\n# 1. CVPR 2021 Accepted Papers and Code by Topic (continuously updated)\n\n\n## Categories:\n\n### [1. Detection](#detection)\n* [2D Object Detection](#IOD)\n * [CVPR 2021 2D object detection papers at a glance (27 papers)](https://bbs.cvmart.net/articles/5120)\n* [Video Object Detection](#VOD)\n* [3D Object Detection](#3DOD)\n* [Human-Object Interaction (HOI) Detection](#HOI)\n* [Camouflaged Object Detection](#COD)\n* [Rotated Object Detection](#ROD)\n* [Salient Object Detection](#SOD)\n* [Image Anomaly Detection](#ADI)\n * [CVPR 2021 image anomaly detection papers at a glance (6 papers)](https://bbs.cvmart.net/articles/5127)\n* [Keypoint Detection](#KeypointDetection)\n\n### [2. Segmentation](#Segmentation)\n* [Image Segmentation](#ImageSegmentation)\n* [Panoptic Segmentation](#PanopticSegmentation)\n* [Semantic Segmentation](#SemanticSegmentation)\n* [Instance Segmentation](#InstanceSegmentation)\n* [Superpixel](#Superpixel)\n* [Video Object Segmentation](#VOS)\n* [Matting](#Matting)\n* [Dense Prediction](#DensePrediction)\n\n### [3. Image Processing](#ImageProcessing)\n\n* [Super Resolution](#SuperResolution)\n* [Image Restoration/Enhancement](#ImageRestoration)\n* [Image Shadow Removal/Reflection Removal](#ISR)\n* [Image Denoising/Deblurring/Deraining/Dehazing](#ImageDenoising)\n* [Image Editing/Inpainting](#ImageEdit)\n* [Image Translation](#ImageTranslation)\n* [Image Quality Assessment](#IQA)\n* [Style Transfer](#StyleTransfer)\n\n### [4. Estimation](#Estimation)\n* [Pose Estimation](#HumanPoseEstimation)\n* [Gesture Estimation](#GestureEstimation)\n* [Flow/Pose/Motion Estimation](#Flow/Pose/MotionEstimation)\n* [Depth Estimation](#DepthEstimation)\n\n### [5. 
Image & Video Retrieval/Understanding](#ImageRetrieval)\n* [Action/Activity Recognition, Detection, and Segmentation](#ActionRecognition)\n* [Person Re-Identification/Detection](#Re-Identification)\n* [Image/Video Captioning](#VideoCaption)\n\n### [6. Face](#Face)\n* [Face Recognition/Detection](#FacialRecognition)\n* [Face Generation/Synthesis/Reconstruction/Editing](#FaceSynthesis)\n* [Face Forgery/Anti-Spoofing](#FaceAnti-Spoofing)\n\n### [7. 3D Vision](#3DVision)\n* [Point Cloud](#3DPC)\n* [3D Reconstruction](#3DReconstruction)\n\n### [8. Object Tracking](#ObjectTracking)\n\n### [9. Medical Imaging](#MedicalImaging)\n\n### [10. Text Detection/Recognition](#TDR)\n\n### [11. Remote Sensing Image](#RSI)\n\n### [12. GAN/Generative/Adversarial](#GAN)\n\n### [13. Image Generation/Synthesis](#IGIS)\n* [View Synthesis](#ViewSynthesis)\n\n### [14. Scene Graph](#SG)\n* [Scene Graph Generation](#SGG)\n* [Scene Graph Prediction](#SGP)\n* [Scene Graph Understanding](#SGU)\n\n### [15. Visual Localization](#VisualLocalization)\n\n### [16. Visual Reasoning/VQA](#VisualReasoning)\n\n### [17. Image Classification](#ImageClassification)\n\n### [18. Neural Network Structure Design](#NNS)\n* [Transformer](#Transformer)\n* [Graph Neural Network (GNN)](#GNN)\n* [Neural Architecture Search (NAS)](#NAS)\n\n### [19. Model Compression](#ModelCompression)\n* [Knowledge Distillation](#KnowledgeDistillation)\n* [Pruning](#Pruning)\n* [Quantization](#Quantization)\n\n### [20. Model Training/Generalization](#ModelTraining)\n* [Noisy Label](#NoisyLabel)\n* [Long-Tailed Distribution](#Long-Tailed)\n\n### [21. Model Evaluation](#ModelEvaluation)\n\n### [22. 
Data Processing](#DataProcessing)\n* [Data Augmentation](#DataAugmentation)\n* [Representation Learning](#RepresentationLearning)\n* [Normalization/Regularization (Batch Normalization)](#BatchNormalization)\n* [Image Clustering](#ImageClustering)\n* [Image Compression](#ImageCompression)\n* [Anomaly Detection](#AnomalyDetection)\n\n### [23. Active Learning](#ActiveLearning)\n\n### [24. Few-shot/Zero-shot Learning](#Few-shotLearning)\n\n### [25. Continual Learning/Life-long Learning](#ContinualLearning)\n\n### [26. Transfer Learning/Domain Adaptation](#domain)\n\n### [27. Metric Learning](#MetricLearning)\n\n### [28. Contrastive Learning](#ContrastiveLearning)\n\n### [29. Incremental Learning](#IncrementalLearning)\n\n### [30. Reinforcement Learning](#RL)\n\n### [31. Meta Learning](#MetaLearning)\n\n### [32. Multi-Modal Learning](#MMLearning)\n* [Audio-visual Learning](#Audio-VisualLearning)\n\n### [33. Vision-based Prediction](#Vision-basedPrediction)\n\n### [34. Dataset](#Dataset)\n\n### [Uncategorized](#100)\n\n\n\n

\n\n \n\n## Detection\n\n\n \n\n### 2D Object Detection\n\n[28] Open-Vocabulary Object Detection Using Captions(使用字幕的开放词汇对象检测)
\n[paper](https://arxiv.org/abs/2011.10678)

\n\n[27] Improved Handling of Motion Blur in Online Object Detection(改进在线对象检测中运动模糊的处理)
\n[paper](https://arxiv.org/abs/2011.14448)

\n\n[26] PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS with Relationship Recovery(PSRR MaxpoolNMS：具有关系恢复的金字塔移位MaxpoolNMS)
\n[paper](https://arxiv.org/abs/2105.12990)

\n\n[25] Domain-Specific Suppression for Adaptive Object Detection(领域特定的自适应对象检测抑制)
\n[paper](https://arxiv.org/abs/2105.03570)

\n\n[24] Line Segment Detection Using Transformers without Edges(【线段检测】使用没有边缘的Transformer进行线段检测)
\n[paper](https://arxiv.org/abs/2101.01909)

\n\n[23] IQDet: Instance-wise Quality Distribution Sampling for Object Detection(IQDet：用于对象检测的按实例进行质量分布采样)
\n[paper](https://arxiv.org/abs/2104.06936)

\n\n[22] Adaptive Class Suppression Loss for Long-Tail Object Detection(长尾目标检测的自适应类抑制损失)
\n[paper](https://arxiv.org/abs/2104.00885) | [code](https://github.com/CASIA-IVA-Lab/ACSL)

\n\n[21] DAP: Detection-Aware Pre-training with Weak Supervision(具有弱监督的可感知检测的预训练)
\n[paper](https://arxiv.org/abs/2103.16651)

\n\n[20] Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection(稠密关系蒸馏与上下文感知聚合用于小样本对象检测)
\n[paper](https://arxiv.org/abs/2103.17115) ｜ [code](https://github.com/hzhupku/DCNet)

\n\n[19] Scale-aware Automatic Augmentation for Object Detection(用于物体检测的可感知规模的自动增强)
\n[paper](https://arxiv.org/abs/2103.17220) | [code](https://github.com/Jia-Research-Lab/SA-AutoAug)

\n\n[18] Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection(数据不确定性指导的多阶段学习，用于半监督对象检测)
\n[paper](https://arxiv.org/abs/2103.16368)

\n\n[17] OTA: Optimal Transport Assignment for Object Detection(OTA：用于对象检测的最佳传输分配)
\n[paper](https://arxiv.org/abs/2103.14259) | [code](https://github.com/Megvii-BaseDetection/OTA)

\n\n[16] Distilling Object Detectors via Decoupled Features(通过解耦功能蒸馏物体检测器)
\n[paper](https://arxiv.org/abs/2103.14475) | [code](https://github.com/ggjy/DeFeat.pytorch)

\n\n[15] I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors(I^3Net：用于适应一阶段对象检测器的隐式实例不变网络)
\n[paper](https://arxiv.org/abs/2103.13757)

\n\n[14] Robust and Accurate Object Detection via Adversarial Learning(通过对抗学习进行稳健而准确的目标检测)
\n[paper](https://arxiv.org/abs/2103.13886) | [model](https://github.com/google/automl/tree/master/efficientdet/Det-AdvProp.md)

\n\n[13] You Only Look One-level Feature
\n[paper](https://arxiv.org/pdf/2103.09460.pdf) | [code](https://github.com/megvii-model/YOLOF)
\n解读：[我扔掉FPN来做目标检测，效果竟然这么强！YOLOF开源：你只需要看一层特征](https://mp.weixin.qq.com/s/GMHx3oMZr0qlP9Iy8tWGMw)\n\n[12] End-to-End Object Detection with Fully Convolutional Network(使用全卷积网络进行端到端目标检测)
\n[paper](https://arxiv.org/abs/2012.03544) | [code](https://github.com/Megvii-BaseDetection/DeFCN)
\n解读：[丢弃Transformer，FCN也可以实现E2E检测](https://zhuanlan.zhihu.com/p/332281368)

\n\n[11] FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding(通过对比提案编码进行的小样本目标检测)
\n[paper](https://arxiv.org/abs/2103.05950v2) ｜ [code](https://github.com/MegviiDetection/FSCE)

\n\n[10] Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection(学习可靠的定位质量估计用于密集目标检测)
\n[paper](https://arxiv.org/pdf/2011.12885.pdf) | [code](https://github.com/implus/GFocalV2)
\n解读:[大白话 Generalized Focal Loss V2](https://zhuanlan.zhihu.com/p/313684358)

\n\n[9] MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection(用于类别识别无监督域自适应对象检测)
\n[paper](https://arxiv.org/pdf/2103.04224.pdf)

\n\n[8] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection(OPANAS：用于目标检测的一次性路径聚合网络架构搜索)
\n[paper](https://arxiv.org/abs/2103.04507) | [code](https://github.com/VDIGPKU/OPANAS)

\n\n[7] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection(小样本目标检测的语义关系推理)
\n[paper](https://arxiv.org/abs/2103.01903)

\n\n[6] General Instance Distillation for Object Detection(通用实例蒸馏技术在目标检测中的应用)
\n[paper](https://arxiv.org/abs/2103.02340)

\n\n[5] Instance Localization for Self-supervised Detection Pretraining(自监督检测预训练的实例定位)
\n[paper](https://arxiv.org/pdf/2102.08318.pdf)｜[code](https://github.com/limbo0000/InstanceLoc)

\n\n[4] Multiple Instance Active Learning for Object Detection（用于对象检测的多实例主动学习）
\n[paper](https://github.com/yuantn/MIAL/raw/master/paper.pdf) | [code](https://github.com/yuantn/MIAL)
\n解读:[MI-AOD: 少量样本实现高检测性能](https://zhuanlan.zhihu.com/p/362764637)
\n\n[3] Towards Open World Object Detection(开放世界中的目标检测)
\n[paper](https://arxiv.org/abs/2103.02603) | [code](https://github.com/JosephKJ/OWOD)
\n解读:[目标检测一卷到底之后，终于有人为它挖了个新坑｜CVPR2021 Oral](https://mp.weixin.qq.com/s/1_WC_hTTc9fgSJUqBsfTYQ)

\n\n[2] Positive-Unlabeled Data Purification in the Wild for Object Detection(野外检测对象的阳性无标签数据提纯)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Guo_Positive-Unlabeled_Data_Purification_in_the_Wild_for_Object_Detection_CVPR_2021_paper.pdf)

\n\n[1] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
\n[paper](https://arxiv.org/pdf/2011.09094.pdf) | [code](https://github.com/dddzg/up-detr)
\n解读：[无监督预训练检测器](https://www.zhihu.com/question/432321109/answer/1606004872)

\n\n\n \n\n### Video Object Detection\n\n[3] Dogfight: Detecting Drones from Drones Videos(从无人机视频中检测无人机)
\n[paper](https://arxiv.org/abs/2103.17242)

\n\n[2] Depth from Camera Motion and Object Detection(相机运动和物体检测的深度)
\n[paper](https://arxiv.org/abs/2103.01468)

\n\n[1] There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge(多模态知识提取的自监督多目标检测与有声跟踪)
\n[paper](https://arxiv.org/abs/2103.01353) | [video](https://www.youtube.com/channel/UCRpM8k1GY3kD2TqCo_yKN3g) | [project](http://rl.uni-freiburg.de/research/multimodal-distill)

\n\n\n \n\n### 3D Object Detection\n\n[15] Exploring intermediate representation for monocular vehicle pose estimation(探索单目车辆姿态估计的中间表示)
\n[paper](https://arxiv.org/abs/2011.08464) ｜ [code](https://github.com/Nicholasli1995/EgoNet)

\n\n[14] SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud(SE-SSD：来自点云的自集成单级目标检测器)
\n[paper](https://arxiv.org/abs/2104.09804) | [code](https://github.com/Vegeta2020/SE-SSD)

\n\n[13] Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds(点云中基于投票的3D对象检测的回溯代表点)
\n[paper](https://arxiv.org/abs/2104.06114) | [code](https://github.com/cheng052/BRNet)

\n\n[12] Objects are Different: Flexible Monocular 3D Object Detection(对象不同：灵活的单眼3D对象检测)
\n[paper](https://arxiv.org/abs/2104.02323) | [code](https://github.com/zhangyp15/MonoFlex)

\n\n[11] HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection(HVPR：用于单阶段3D对象检测的混合体素点表示)
\n[paper](https://arxiv.org/abs/2104.00902)

\n\n[10] GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection(用于单眼3D对象检测的数学可微分的分组NMS)
\n[paper](https://arxiv.org/abs/2103.17202) | [code](https://github.com/abhi1kumar/groomed_nms)

\n\n[9] Delving into Localization Errors for Monocular 3D Object Detection(深入研究单目3D对象检测的定位错误)
\n[paper](https://arxiv.org/abs/2103.16237) | [code](https://github.com/xinzhuma/monodle)

\n\n[8] Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection(用于单眼3D对象检测的深度条件动态消息传播)
\n[paper](https://arxiv.org/abs/2103.16470) | [code](https://github.com/fudan-zvg/DDMP)

\n\n[7] LiDAR R-CNN: An Efficient and Universal 3D Object Detector(高效且通用的3D对象检测器)
\n[paper](https://arxiv.org/abs/2103.15297) | [code](https://github.com/tusimple/LiDAR_RCNN)

\n\n[6] M3DSSD: Monocular 3D Single Stage Object Detector(单眼3D单级目标检测器)
\n[paper](https://arxiv.org/abs/2103.13164)

\n\n[5] MonoRUn: Monocular 3D Object Detection by Self-Supervised Reconstruction and Uncertainty Propagation(通过自我监督的重构和不确定性传播进行单眼3D目标检测)
\n[paper](https://arxiv.org/abs/2103.12605)

\n\n[4] ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection(ST3D：在三维目标检测上进行无监督域自适应的自训练)
\n[paper](https://arxiv.org/pdf/2103.05346.pdf) | [code](https://github.com/CVMI-Lab/ST3D)

\n\n[3] Center-based 3D Object Detection and Tracking(基于中心的3D目标检测和跟踪)
\n[paper](https://arxiv.org/abs/2006.11275) | [code](https://github.com/tianweiy/CenterPoint)

\n\n[2] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection(利用IoU预测进行半监督3D对象检测)
\n[paper](https://arxiv.org/pdf/2012.04355.pdf) | [code](https://github.com/THU17cyz/3DIoUMatch) | [project](https://thu17cyz.github.io/3DIoUMatch/) | [video](https://youtu.be/nuARjhkQN2U)

\n\n[1] Categorical Depth Distribution Network for Monocular 3D Object Detection(用于单目三维目标检测的分类深度分布网络)
\n[paper](https://arxiv.org/abs/2103.01100)

\n\n \n\n### Human-Object Interaction (HOI) Detection\n\n[7] HOTR: End-to-End Human-Object Interaction Detection with Transformers(HOTR：使用Transformer进行端到端的人与对象交互检测)
\n[paper](https://arxiv.org/abs/2104.13682)

\n\n[6] Glance and Gaze: Inferring Action-aware Points for One-Stage Human-Object Interaction Detection(扫视与凝视：推断动作感知点，用于一阶段的人物交互检测)
\n[paper](https://arxiv.org/abs/2104.05269)

\n\n[5] Affordance Transfer Learning for Human-Object Interaction Detection(用于人物交互检测的可供性迁移学习)
\n[paper](https://arxiv.org/abs/2104.02867) | [code](https://github.com/zhihou7/HOI-CL)

\n\n[4] Detecting Human-Object Interaction via Fabricated Compositional Learning(通过人为构图学习检测人与物体的相互作用)
\n[paper](https://arxiv.org/abs/2103.08214) | [code](https://github.com/zhihou7/FCL)

\n\n[3] Reformulating HOI Detection as Adaptive Set Prediction(将人物交互检测重新配置为自适应集预测)
\n[paper](https://arxiv.org/abs/2103.05983) | [code](https://arxiv.org/abs/2103.05983)

\n\n[2] QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information(具有图像范围的上下文信息的基于查询的成对人物交互检测)
\n[paper](https://arxiv.org/abs/2103.05399) | [code](https://github.com/hitachi-rd-cv/qpic)

\n\n[1] End-to-End Human Object Interaction Detection with HOI Transformer(使用HOI Transformer进行端到端的人类对象交互检测)
\n[paper](https://arxiv.org/pdf/2103.04503.pdf) | [code](https://github.com/bbepoch/HoiTransformer)

\n\n \n\n### Camouflaged Object Detection\n\n[4] Camouflaged Object Segmentation with Distraction Mining(带有干扰挖掘的伪装对象分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Mei_Camouflaged_Object_Segmentation_With_Distraction_Mining_CVPR_2021_paper.pdf) | [code](https://mhaiyang.github.io/CVPR2021_PFNet/index)

\n\n[3] Mutual Graph Learning for Camouflaged Object Detection(用于伪装目标检测的互图学习)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Zhai_Mutual_Graph_Learning_for_Camouflaged_Object_Detection_CVPR_2021_paper.pdf)

\n\n[2] Uncertainty-aware Joint Salient Object and Camouflaged Object Detection(不确定度联合显着物体和伪装物体检测)
\n[paper](https://arxiv.org/abs/2104.02628)

\n\n[1] Simultaneously Localize, Segment and Rank the Camouflaged Objects(同时定位，分割和排序伪装的对象)
\n[paper](https://arxiv.org/abs/2103.04011) | [code](https://github.com/JingZhang617/COD-Rank-Localize-and-Segment)

\n\n\n \n\n### Rotated Object Detection\n\n[2] ReDet: A Rotation-equivariant Detector for Aerial Object Detection(ReDet：用于航空物体检测的旋转等变检测器)
\n[paper](https://arxiv.org/abs/2103.07733) | [code](https://github.com/csuhan/ReDet)

\n\n[1] Dense Label Encoding for Boundary Discontinuity Free Rotation Detection(密集标签编码，用于边界不连续自由旋转检测)
\n[paper](https://arxiv.org/abs/2011.09670) | [code](https://github.com/yangxue0827/RotationDetection) | [解读-DCL：旋转目标检测新方法](https://zhuanlan.zhihu.com/p/354373013)

\n\n \n\n### Salient Object Detection\n\n[4] Calibrated RGB-D Salient Object Detection(校准的 RGB-D 显著物体检测)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Ji_Calibrated_RGB-D_Salient_Object_Detection_CVPR_2021_paper.pdf) | [code](https://github.com/jiwei0921/DCF)

\n\n[3] Weakly Supervised Video Salient Object Detection(弱监督视频显著性目标检测)
\n[paper](https://arxiv.org/abs/2104.02391)

\n\n[2] Group Collaborative Learning for Co-Salient Object Detection(协同显著性目标检测的小组协作学习)
\n[paper](https://arxiv.org/abs/2104.01108) | [project](https://github.com/fanq15/GCoNet)

\n\n[1] Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion(具有深度敏感注意力和自动多模态融合的深度RGB-D显著性检测)
\n[paper](https://arxiv.org/abs/2103.11832)

\n\n \n\n### Image Anomaly Detection\n\n[7] Anomaly Detection in Video via Self-Supervised and Multi-Task Learning(通过自我监督和多任务学习进行视频异常检测)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Georgescu_Anomaly_Detection_in_Video_via_Self-Supervised_and_Multi-Task_Learning_CVPR_2021_paper.pdf)

\n\n[6] MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection(用于视频异常检测的多实例自训练框架)
\n[paper](https://arxiv.org/abs/2104.01633)

\n\n[5] CutPaste: Self-Supervised Learning for Anomaly Detection and Localization(CutPaste：用于异常检测和定位的自我监督学习)
\n[paper](https://arxiv.org/abs/2104.04015)

\n\n[4] CutPaste: Self-Supervised Learning for Anomaly Detection and Localization(CutPaste：用于异常检测和定位的自监督学习)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Li_CutPaste_Self-Supervised_Learning_for_Anomaly_Detection_and_Localization_CVPR_2021_paper.pdf)

\n\n[3] Pixel-wise Anomaly Detection in Complex Driving Scenes(复杂驾驶场景中的逐像素异常检测)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Di_Biase_Pixel-Wise_Anomaly_Detection_in_Complex_Driving_Scenes_CVPR_2021_paper.pdf)

\n\n[2] PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation(PANDA：调整用于异常检测和分割的预训练特征)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Reiss_PANDA_Adapting_Pretrained_Features_for_Anomaly_Detection_and_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/talreiss/PANDA)

\n\n[1] Multiresolution Knowledge Distillation for Anomaly Detection(用于异常检测的多分辨率知识蒸馏)
\n[paper](https://arxiv.org/abs/2011.11108)

\n\n \n\n### Keypoint Detection\n\n[1] Skeleton Merger: an Unsupervised Aligned Keypoint Detector(骨架合并：无监督的对准关键点检测器)
\n[paper](https://arxiv.org/pdf/2103.10814.pdf) | [code](https://github.com/eliphatfs/SkeletonMerger)

\n\n\n
\n \n\n## Segmentation\n\n \n\n### Image Segmentation\n\n[14] ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation(ATSO：半监督图像分割的异步师生优化)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Huo_ATSO_Asynchronous_Teacher-Student_Optimization_for_Semi-Supervised_Image_Segmentation_CVPR_2021_paper.pdf)

\n\n[13] Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation(用于【基于文本的图像分割】的具有协同注意力嵌入的编码器融合网络)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Feng_Encoder_Fusion_Network_With_Co-Attention_Embedding_for_Referring_Image_Segmentation_CVPR_2021_paper.pdf)

\n\n[12] Bottom-Up Shift and Reasoning for Referring Image Segmentation(【基于文本的图像分割】的自底向上移位和推理)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Yang_Bottom-Up_Shift_and_Reasoning_for_Referring_Image_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/incredibleXM/BUSNet)

\n\n[11] Every Annotation Counts: Multi-label Deep Supervision for Medical Image Segmentation(每种注释都至关重要：【医学图像分割】的多标签深度监管)
\n[paper](https://arxiv.org/abs/2104.13243)

\n\n[10] Camouflaged Object Segmentation with Distraction Mining(【伪装目标分割】基于分心挖掘的伪装目标分割)
\n[paper](https://arxiv.org/abs/2104.10475)

\n\n[9] Adaptive Prototype Learning and Allocation for Few-Shot Segmentation(【小样本分割】的自适应原型学习和分配)
\n[paper](https://arxiv.org/abs/2104.01893)

\n\n[8] DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation(DiNTS：用于3D【医学图像分割】的可微分神经网络拓扑搜索)
\n[paper](https://arxiv.org/abs/2103.15954)

\n\n[7] Self-Guided and Cross-Guided Learning for Few-Shot Segmentation(自我指导和交叉指导学习，用于【小样本分割】)
\n[paper](https://arxiv.org/abs/2103.16129)

\n\n[6] Locate then Segment: A Strong Pipeline for Referring Image Segmentation(找到然后分割：用于【基于文本的图像分割】的强大管道)
\n[paper](https://arxiv.org/abs/2103.16284)

\n\n[5] Boundary IoU: Improving Object-Centric Image Segmentation Evaluation(边界IoU：改进以对象为中心的图像分割评估)
\n[paper](https://arxiv.org/abs/2103.16562) | [code](https://bowenc0221.github.io/boundary-iou)

\n\n[4] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation(语义流经点以进行【航空图像分割】)
\n[paper](https://arxiv.org/pdf/2103.06564.pdf)

\n\n[3] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space(在连续频率空间中通过情景学习进行【医学图像分割】的联合域泛化)
\n[paper](https://arxiv.org/abs/2103.06030) | [code](https://github.com/liuquande/FedDG-ELCFS)

\n\n[2] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?(【小样本分割】没有元学习的小样本分割：你只需要一个好的转换推论？)
\n[paper](https://arxiv.org/abs/2012.06166) | [code](https://github.com/mboudiaf/RePRI-for-Few-Shot-Segmentation)

\n\n[1] Learning Calibrated Medical Image Segmentation via Multi-rater Agreement Modeling(通过多评分者协议建模学习校准的【医学图像分割】)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Ji_Learning_Calibrated_Medical_Image_Segmentation_via_Multi-Rater_Agreement_Modeling_CVPR_2021_paper.pdf) ｜ [code](https://github.com/jiwei0921/MRNet/)

\n\n\n \n\n### Panoptic Segmentation\n\n[15] Fully Convolutional Networks for Panoptic Segmentation(Oral | 用于全景分割的全卷积网络)
\n[paper](https://arxiv.org/abs/2012.00720) | [code](https://github.com/Jia-Research-Lab/PanopticFCN)

\n\n[14] Part-aware Panoptic Segmentation(部分感知全景分割)
\n[paper](https://arxiv.org/abs/2106.06351) | [code](https://github.com/tue-mps/panoptic_parts)

\n\n[13] LiDAR-based Panoptic Segmentation via Dynamic Shifting Network(通过动态移动网络基于 LiDAR 的全景分割)
\n[paper](https://arxiv.org/abs/2011.11964) | [code](https://github.com/hongfz16/DS-Net)

\n\n[12] Learning to Associate Every Segment for Video Panoptic Segmentation(学习关联视频全景分割的每个片段)
\n[paper](https://arxiv.org/abs/2106.09453)

\n\n[11] Hierarchical Lovász Embeddings for Proposal-free Panoptic Segmentation(用于无提议全景分割的分层 Lovász 嵌入)
\n[paper](https://arxiv.org/abs/2106.04555)

\n\n[10] Exemplar-Based Open-Set Panoptic Segmentation Network(基于范例的开放集全景分割网络)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Hwang_Exemplar-Based_Open-Set_Panoptic_Segmentation_Network_CVPR_2021_paper.pdf) | [project](https://cv.snu.ac.kr/research/EOPSN)

\n\n[9] ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Qiao_VIP-DeepLab_Learning_Visual_Perception_With_Depth-Aware_Video_Panoptic_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/joe-siyuan-qiao/ViP-DeepLab)

\n\n[8] LPSNet: A lightweight solution for fast panoptic segmentation(LPSNet：快速全景分割的轻量级解决方案)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Hong_LPSNet_A_Lightweight_Solution_for_Fast_Panoptic_Segmentation_CVPR_2021_paper.pdf)

\n\n[7] Improving Panoptic Segmentation at All Scales(改进所有尺度的全景分割)
\n[paper](http://arxiv.org/abs/2012.07717)

\n\n[6] Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation(面向弱监督全景分割的联合物物挖掘)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Shen_Toward_Joint_Thing-and-Stuff_Mining_for_Weakly_Supervised_Panoptic_Segmentation_CVPR_2021_paper.pdf)

\n\n[5] MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers(MaX-DeepLab：使用掩模转换器进行端到端全景分割)
\n[paper](https://arxiv.org/abs/2012.00759) ｜ [code](https://github.com/google-research/deeplab2)

\n\n[4] Panoptic Segmentation Forecasting(全景分割预测)
\n[paper](https://arxiv.org/abs/2104.03962)

\n\n[3] Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation(无提案的LiDAR点云全景分割)
\n[paper](https://arxiv.org/pdf/2103.14962.pdf)

\n\n[2] Cross-View Regularization for Domain Adaptive Panoptic Segmentation(用于域自适应全景分割的跨视图正则化)
\n[paper](https://arxiv.org/abs/2103.02584)

\n\n[1] 4D Panoptic LiDAR Segmentation（4D全景LiDAR分割）
\n[paper](https://arxiv.org/abs/2102.12472)

\n\n \n\n### Semantic Segmentation\n\n[49] DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation(DCNAS：语义图像分割的密集连接神经架构搜索)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Zhang_DCNAS_Densely_Connected_Neural_Architecture_Search_for_Semantic_Image_Segmentation_CVPR_2021_paper.pdf)

\n\n[48] Semi-supervised Semantic Segmentation with Directional Context-aware Consistency(具有定向上下文感知一致性的半监督语义分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Lai_Semi-Supervised_Semantic_Segmentation_With_Directional_Context-Aware_Consistency_CVPR_2021_paper.pdf)

\n\n[47] Scale-Aware Graph Neural Network for Few-Shot Semantic Segmentation(用于小样本语义分割的尺度感知图神经网络)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Xie_Scale-Aware_Graph_Neural_Network_for_Few-Shot_Semantic_Segmentation_CVPR_2021_paper.pdf)

\n\n[46] Uncertainty Reduction for Model Adaptation in Semantic Segmentation(语义分割中模型自适应的不确定性降低)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/S_Uncertainty_Reduction_for_Model_Adaptation_in_Semantic_Segmentation_CVPR_2021_paper.pdf)

\n\n[45] Rethinking BiSeNet For Real-time Semantic Segmentation(重新思考用于实时语义分割的 BiSeNet)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Fan_Rethinking_BiSeNet_for_Real-Time_Semantic_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/MichaelFan01/STDC-Seg)

\n\n[44] HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation(HyperSeg：用于实时语义分割的 Patch-wise Hypernetwork)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Nirkin_HyperSeg_Patch-Wise_Hypernetwork_for_Real-Time_Semantic_Segmentation_CVPR_2021_paper.pdf) | [code](https://nirkin.com/hyperseg)

\n\n[43] Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds(完整和标签：激光雷达点云语义分割的域适应方法)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Yi_Complete__Label_A_Domain_Adaptation_Approach_to_Semantic_Segmentation_CVPR_2021_paper.pdf)

\n\n[42] Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation(弱监督语义分割的非显著区域对象挖掘)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Yao_Non-Salient_Region_Object_Mining_for_Weakly_Supervised_Semantic_Segmentation_CVPR_2021_paper.pdf) ｜ [code](https://github.com/NUST-Machine-Intelligence-Laboratory/nsrom)

\n\n[41] Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation(域自适应语义分割的原型伪标签去噪和目标结构学习)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Zhang_Prototypical_Pseudo_Label_Denoising_and_Target_Structure_Learning_for_Domain_CVPR_2021_paper.pdf) | [code](https://github.com/microsoft/ProDA)

\n\n[40] Few-shot 3D Point Cloud Semantic Segmentation(小样本 3D 点云语义分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Zhao_Few-Shot_3D_Point_Cloud_Semantic_Segmentation_CVPR_2021_paper.pdf)

\n\n[39] Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation(弱监督语义分割的嵌入式判别注意机制)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Wu_Embedded_Discriminative_Attention_Mechanism_for_Weakly_Supervised_Semantic_Segmentation_CVPR_2021_paper.pdf)

\n\n[38] Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation(针对小样本语义分割的抗锯齿语义重建)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Liu_Anti-Aliasing_Semantic_Reconstruction_for_Few-Shot_Semantic_Segmentation_CVPR_2021_paper.pdf) ｜ [code](https://github.com/Bibkiller/ASR)

\n\n[37] Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation(使用自监督深度估计改进语义分割的三种方法)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Hoyer_Three_Ways_To_Improve_Semantic_Segmentation_With_Self-Supervised_Depth_Estimation_CVPR_2021_paper.pdf) | [code](https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth)

\n\n[36] (AF)²-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network(用于稀疏语义分割网络的具有自适应特征选择的注意力特征融合)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Cheng_AF2-S3Net_Attentive_Feature_Fusion_With_Adaptive_Feature_Selection_for_Sparse_CVPR_2021_paper.pdf)

\n\n[35] Exploit Visual Dependency Relations for Semantic Segmentation(利用视觉依赖关系进行语义分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Liu_Exploit_Visual_Dependency_Relations_for_Semantic_Segmentation_CVPR_2021_paper.pdf)

\n\n[34] Revisiting Superpixels for Active Learning in Semantic Segmentation with Realistic Annotation Costs(在具有现实注释成本的语义分割中重新审视用于主动学习的超像素)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Cai_Revisiting_Superpixels_for_Active_Learning_in_Semantic_Segmentation_With_Realistic_CVPR_2021_paper.pdf)

\n\n[33] ABMDRNet: Adaptive-weighted Bi-directional Modality Difference Reduction Network for RGB-T Semantic Segmentation(ABMDRNet：用于 RGB-T 语义分割的自适应加权双向模态差异减少网络)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Zhang_ABMDRNet_Adaptive-Weighted_Bi-Directional_Modality_Difference_Reduction_Network_for_RGB-T_Semantic_CVPR_2021_paper.pdf)

\n\n[32] CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation(CGA-Net：点云语义分割的类别引导聚合)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Lu_CGA-Net_Category_Guided_Aggregation_for_Point_Cloud_Semantic_Segmentation_CVPR_2021_paper.pdf)

\n\n[31] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers(使用 Transformer 从序列到序列的角度重新思考语义分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Zheng_Rethinking_Semantic_Segmentation_From_a_Sequence-to-Sequence_Perspective_With_Transformers_CVPR_2021_paper.pdf) | [project](https://fudan-zvg.github.io/SETR)

\n\n[30] Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation(弱和半监督语义分割的反对抗操纵属性)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Lee_Anti-Adversarially_Manipulated_Attributions_for_Weakly_and_Semi-Supervised_Semantic_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/jbeomlee93/AdvCAM)

\n\n[29] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision(具有交叉伪监督的半监督语义分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Chen_Semi-Supervised_Semantic_Segmentation_With_Cross_Pseudo_Supervision_CVPR_2021_paper.pdf)

\n\n[28] Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation(铁路不是火车：作为弱监督语义分割的伪像素监督的显著性)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Lee_Railroad_Is_Not_a_Train_Saliency_As_Pseudo-Pixel_Supervision_for_CVPR_2021_paper.pdf) | [code](https://github.com/halbielee/EPS)

\n\n[27] Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation(集群、拆分、融合和更新：开放复合域自适应语义分割的元学习)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Gong_Cluster_Split_Fuse_and_Update_Meta-Learning_for_Open_Compound_Domain_CVPR_2021_paper.pdf)

\n\n[26] Omni-supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning(通过渐进感受野分量推理的全方位监督点云分割)
\n[paper](https://arxiv.org/abs/2105.10203) | [code](https://github.com/azuki-miho/RFCR)

\n\n[25] Unsupervised Part Segmentation through Disentangling Appearance and Shape(基于外观和形状分离的无监督零件分割)
\n[paper](https://arxiv.org/abs/2105.12405)

\n\n[24] Self-supervised Augmentation Consistency for Adapting Semantic Segmentation(自适应语义分割的自我监督增强一致性)
\n[paper](https://arxiv.org/abs/2105.00097) | [code](https://github.com/visinf/da-sac)

\n\n[23] DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation(DANNet：一种用于无监督夜间语义切分的单阶段域自适应网络)
\n[paper](https://arxiv.org/abs/2104.10834) | [code](https://github.com/W-zx-Y/DANNet)

\n\n[22] Improving Online Performance Prediction for Semantic Segmentation(改进用于语义分割的在线性能预测)
\n[paper](https://arxiv.org/abs/2104.05255)

\n\n[21] Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization(生成模型的语义分割：半监督学习和强大的域外泛化)
\n[paper](https://arxiv.org/abs/2104.05833) | [code](https://nv-tlabs.github.io/semanticGAN/)

\n\n[20] Progressive Semantic Segmentation(渐进式语义分割)
\n[paper](https://arxiv.org/abs/2104.03778) | [code](https://github.com/VinAIResearch/MagNet)

\n\n[19] InverseForm: A Loss Function for Structured Boundary-Aware Segmentation(结构化边界感知分割的损失函数)
\n[paper](https://arxiv.org/abs/2104.02745)

\n\n[18] 3D-to-2D Distillation for Indoor Scene Parsing(用于室内场景解析的3D到2D蒸馏)
\n[paper](https://arxiv.org/abs/2104.02243)

\n\n[17] One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation(一键式点击：一种用于弱监督3D语义分割的自训练方法)
\n[paper](https://arxiv.org/abs/2104.02246)

\n\n[16] Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation(弱监督语义分割的背景感知池和噪声感知损失)
\n[paper](https://arxiv.org/abs/2104.00905)

\n\n[15] PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering(PiCIE：在聚类中使用不变性和等方差的无监督语义分割)
\n[paper](https://arxiv.org/abs/2103.17070) | [code](https://github.com/janghyuncho/PiCIE)

\n\n[14] Source-Free Domain Adaptation for Semantic Segmentation(用于语义分割的无源域自适应)
\n[paper](https://arxiv.org/abs/2103.16372)

\n\n[13] RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening(通过实例选择性增白提高城市场景分割中的域泛化)
\n[paper](https://arxiv.org/abs/2103.15597) | [code](https://github.com/shachoi/RobustNet)

\n\n[12] Coarse-to-Fine Domain Adaptive Semantic Segmentation with Photometric Alignment and Category-Center Regularization(具有光度对齐和类别中心正则化的粗到细域自适应语义分割)
\n[paper](https://arxiv.org/abs/2103.13041)

\n\n[11] Cross-Dataset Collaborative Learning for Semantic Segmentation(跨数据集协同学习的语义分割)
\n[paper](https://arxiv.org/abs/2103.11351)

\n\n[10] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation(用于弱监督语义和实例细分的边界框归因图)
\n[paper](https://arxiv.org/abs/2103.08907) | [code](https://github.com/jbeomlee93/BBAM)

\n\n[9] Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations(通过稀疏和解耦的潜在表示的排斥-吸引进行连续语义分割)
\n[paper](https://arxiv.org/abs/2103.06342)

\n\n[8] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion(通过双边扩充和自适应融合对实点云场景进行语义分割)
\n[paper](https://arxiv.org/abs/2103.07074)

\n\n[7] Capturing Omni-Range Context for Omnidirectional Segmentation(捕获全方位上下文进行全方位分割)
\n[paper](https://arxiv.org/abs/2103.05687)

\n\n[6] MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation(MetaCorrection：语义分割中无监督域自适应的域感知元丢失校正)
\n[paper](https://arxiv.org/abs/2103.05254)

\n\n[5] Learning Statistical Texture for Semantic Segmentation(学习用于语义分割的统计纹理)
\n[paper](https://arxiv.org/abs/2103.04133)

\n\n[4] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation(基于双层域混合的半监督域自适应语义分割)
\n[paper](https://arxiv.org/pdf/2103.04705.pdf)

\n\n[3] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation(多源领域自适应与协作学习的语义分割)
\n[paper](https://arxiv.org/abs/2103.04717)

\n\n[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges(走向城市规模3D点云的语义分割：数据集，基准和挑战)
\n[paper](https://arxiv.org/abs/2009.03137) | [code](https://github.com/QingyongHu/SensatUrban)

\n\n[1] PLOP: Learning without Forgetting for Continual Semantic Segmentation（PLOP：学习而不会忘记连续的语义分割）
\n[paper](https://arxiv.org/abs/2011.11390) | [code](https://github.com/arthurdouillard/CVPR2021_PLOP)

\n\n \n\n### 实例分割(Instance Segmentation)\n\n[24] BoxInst: High-Performance Instance Segmentation with Box Annotations(BoxInst：带框注释的高性能实例分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Tian_BoxInst_High-Performance_Instance_Segmentation_With_Box_Annotations_CVPR_2021_paper.pdf) | [code](https://git.io/AdelaiDet)

\n\n[23] Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision(使用分层自监督在实例分割中无监督地发现长尾)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Weng_Unsupervised_Discovery_of_the_Long-Tail_in_Instance_Segmentation_Using_Hierarchical_CVPR_2021_paper.pdf)

\n\n[22] Seesaw Loss for Long-Tailed Instance Segmentation(长尾实例分割的跷跷板损失)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Wang_Seesaw_Loss_for_Long-Tailed_Instance_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/open-mmlab/mmdetection)

\n\n[21] ColorRL: Reinforced Coloring for End-to-End Instance Segmentation(ColorRL：端到端实例分割的强化着色)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Tuan_ColorRL_Reinforced_Coloring_for_End-to-End_Instance_Segmentation_CVPR_2021_paper.pdf)

\n\n[20] SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation(SG-Net：用于单阶段视频实例分割的空间粒度网络)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Liu_SG-Net_Spatial_Granularity_Network_for_One-Stage_Video_Instance_Segmentation_CVPR_2021_paper.pdf)

\n\n[19] Deeply Shape-guided Cascade for Instance Segmentation(用于实例分割的深度形状引导级联)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Ding_Deeply_Shape-Guided_Cascade_for_Instance_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/hding2455/DSC)

\n\n[18] DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation(DCT-Mask：用于实例分割的离散余弦变换掩码表示)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Shen_DCT-Mask_Discrete_Cosine_Transform_Mask_Representation_for_Instance_Segmentation_CVPR_2021_paper.pdf)

\n\n[17] Point Cloud Instance Segmentation using Probabilistic Embeddings(使用概率嵌入的点云实例分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Zhang_Point_Cloud_Instance_Segmentation_Using_Probabilistic_Embeddings_CVPR_2021_paper.pdf)

\n\n[16] Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation(简单的复制粘贴是一种用于实例分割的强数据增强方法)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Ghiasi_Simple_Copy-Paste_Is_a_Strong_Data_Augmentation_Method_for_Instance_CVPR_2021_paper.pdf) | [code](https://github.com/tensorflow/tpu/tree/master/models/official/detection/projects/copy_paste)

\n\n[15] Zero-Shot Instance Segmentation(零样本实例分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Zheng_Zero-Shot_Instance_Segmentation_CVPR_2021_paper.pdf)

\n\n[14] DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution(DyCo3D：通过动态卷积对 3D 点云进行稳健的实例分割)
\n[paper](https://arxiv.org/abs/2011.13328) | [code](https://git.io/DyCo3D)

\n\n[13] Incremental Few-Shot Instance Segmentation(增量小样本实例分割)
\n[paper](https://arxiv.org/abs/2105.05312) | [code](https://github.com/danganea/iMTFA)

\n\n[12] Robust Instance Segmentation through Reasoning about Multi-Object Occlusion(通过推理多对象遮挡进行鲁棒的实例分割)
\n[paper](https://arxiv.org/abs/2012.02107) | [code](https://github.com/XD7479/Multi-Object-Occlusion)

\n\n[11] A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation(A ^ 2-FPN：基于注意力聚合的特征金字塔网络，用于实例分割)
\n[paper](https://arxiv.org/abs/2105.03186)

\n\n[10] RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features(RefineMask：通过细粒度功能实现高质量实例分割)
\n[paper](https://arxiv.org/abs/2104.08569) | [code](https://github.com/zhanggang001/RefineMask)

\n\n[9] Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation(看起来更接近以更好地分割：用于实例分割的边界补丁优化)
\n[paper](https://arxiv.org/abs/2104.05239) | [code](https://github.com/tinyalpha/BPR)

\n\n[8] Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation(空间特征校准和时间融合，以实现有效的一级视频实例分割)
\n[paper](https://arxiv.org/abs/2104.05606) | [code](https://github.com/MinghanLi/STMask)

\n\n[7] DARCNN: Domain Adaptive Region-based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images(DARCNN：用于生物医学图像中无监督实例分割的基于域自适应区域的卷积神经网络)
\n[paper](https://arxiv.org/abs/2104.01325)

\n\n[6] Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images(通过带有显着图像的类不可知学习进行弱监督实例分割)
\n[paper](https://arxiv.org/abs/2104.01526) | [code](https://github.com/hustvl/BoxCaseg)

\n\n[5] FAPIS: A Few-shot Anchor-free Part-based Instance Segmenter(FAPIS：少量基于无锚的基于实例分割器)
\n[paper](https://arxiv.org/abs/2104.00073)

\n\n[4] Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency(具有时间掩码一致性的视频的弱监督实例分割)
\n[paper](https://arxiv.org/abs/2103.12886)

\n\n[3] Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers(具有重叠BiLayer的深度遮挡感知实例分割)
\n[paper](https://arxiv.org/abs/2103.12340) | [code](https://github.com/lkeab/BCNet)

\n\n[2] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation(用于弱监督语义和实例细分的边界框归因图)
\n[paper](https://arxiv.org/abs/2103.08907) | [code](https://github.com/jbeomlee93/BBAM)

\n\n[1] End-to-End Video Instance Segmentation with Transformers(使用Transformer的端到端视频实例分割)
\n[paper](https://arxiv.org/abs/2011.14503) | [code](https://github.com/Epiphqny/VisTR)\n

\n\n \n\n### 超像素(Superpixel)\n\n[2] Revisiting Superpixels for Active Learning in Semantic Segmentation with Realistic Annotation Costs(在具有现实注释成本的语义分割中重新审视用于主动学习的超像素)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Cai_Revisiting_Superpixels_for_Active_Learning_in_Semantic_Segmentation_With_Realistic_CVPR_2021_paper.pdf)

\n\n[1] Learning the Superpixel in a Non-iterative and Lifelong Manner(以非迭代和终身的方式学习超像素)
\n[paper](https://arxiv.org/pdf/2103.10681.pdf)

\n\n \n\n### 视频目标分割(Video Object Segmentation)\n\n[12] Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation(在半监督视频对象分割中使用重用门函数学习动态网络)
\n[paper](https://arxiv.org/abs/2012.11655) | [code](https://github.com/HYOJINPARK/Reuse_VOS)

\n\n[11] SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation(Oral|SSTVOS：用于视频对象分割的稀疏时空变换器)
\n[paper](https://arxiv.org/abs/2101.08833) | [code](https://github.com/dukebw/SSTVOS)

\n\n[10] SwiftNet: Real-time Video Object Segmentation(SwiftNet：实时视频对象分割)
\n[paper](https://arxiv.org/abs/2102.04604) | [code](https://github.com/haochenheheda/SwiftNet)

\n\n[9] Video Object Segmentation Using Global and Instance Embedding Learning(使用全局和实例嵌入学习的视频对象分割)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Ge_Video_Object_Segmentation_Using_Global_and_Instance_Embedding_Learning_CVPR_2021_paper.pdf)

\n\n[8] Delving Deep into Many-to-many Attention for Few-shot Video Object Segmentation(深入研究小样本视频对象分割的多对多注意力)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Chen_Delving_Deep_Into_Many-to-Many_Attention_for_Few-Shot_Video_Object_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/scutpaul/DANet)

\n\n[7] Reciprocal Transformations for Unsupervised Video Object Segmentation(无监督视频对象分割的互易变换)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Ren_Reciprocal_Transformations_for_Unsupervised_Video_Object_Segmentation_CVPR_2021_paper.pdf) | [code](https://github.com/OliverRensu/RTNet)

\n\n[6] Learning Position and Target Consistency for Memory-based Video Object Segmentation(基于内存的视频对象分割的学习位置和目标一致性)
\n[paper](https://arxiv.org/abs/2104.04329)

\n\n[5] Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps(基于可靠性的注意映射引导交互式视频对象分割)
\n[paper](https://arxiv.org/abs/2104.10386) | [code](https://github.com/yuk6heo/GIS-RAmap)

\n\n[4] Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation(无监督视频多对象分割的目标感知对象发现和关联)
\n[paper](https://arxiv.org/abs/2104.04782)

\n\n[3] Efficient Regional Memory Network for Video Object Segmentation(用于视频对象分割的高效区域存储网络)
\n[paper](https://arxiv.org/abs/2103.12934) | [code](https://haozhexie.com/project/rmnet)

\n\n[2] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild(学习推荐帧用于交互式野外视频对象分割)
\n[paper](https://arxiv.org/pdf/2103.10391.pdf) | [code](https://github.com/svip-lab/IVOS-W)

\n\n[1] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion(模块化交互式视频对象分割：面具交互，传播和差异感知融合)
\n[paper](https://arxiv.org/pdf/2103.07941.pdf) | [project](https://hkchengrex.github.io/MiVOS/)

\n\n \n\n### 抠图(Matting)\n\n[7] Deep Video Matting via Spatio-Temporal Alignment and Aggregation(通过时空对齐和聚合的深度视频抠图)
\n[paper](https://arxiv.org/abs/2104.11208)

\n\n[6] Learning Affinity-Aware Upsampling for Deep Image Matting(学习用于深度图像抠图的亲和感知上采样)
\n[paper](https://arxiv.org/abs/2011.14288)

\n\n[5] Semantic Image Matting(语义图像抠图)
\n[paper](https://arxiv.org/abs/2104.08201) | [code](https://github.com/nowsyn/SIM)

\n\n[4] Mask Guided Matting via Progressive Refinement Network(通过渐进式细化网络的掩码引导抠图)
\n[paper](https://arxiv.org/abs/2012.06722) | [code](https://github.com/yucornetto/MGMatting)

\n\n[3] Omnimatte: Associating Objects and Their Effects in Video(Omnimatte：在视频中关联对象及其效果)
\n[paper](https://arxiv.org/abs/2105.06993) | [project](https://omnimatte.github.io/)

\n\n[2] Improved Image Matting via Real-time User Clicks and Uncertainty Estimation(通过实时用户点击和不确定性估计改善图像抠图)
\n[paper](https://arxiv.org/abs/2012.08323)

\n\n[1] Real-Time High Resolution Background Matting
\n[paper](https://arxiv.org/abs/2012.07810) | [code](https://github.com/PeterL1n/BackgroundMattingV2) | [project](https://grail.cs.washington.edu/projects/background-matting-v2/) | [video](https://youtu.be/oMfPTeYDF9g)

\n\n \n\n### 密集预测(Dense Prediction)\n\n[3] Generic Perceptual Loss for Modeling Structured Output Dependencies(用于建模结构化输出依存关系的一般感知损失)
\n[paper](https://arxiv.org/pdf/2103.10571.pdf)

\n\n[2] Densely connected multidilated convolutional networks for dense prediction tasks（用于密集预测任务的多重卷积连接网络）
\n[paper](https://arxiv.org/abs/2011.11844)

\n\n[1] Dense Contrastive Learning for Self-Supervised Visual Pre-Training(自监督视觉预训练的密集对比学习)
\n[paper](https://arxiv.org/abs/2011.09157) | [code](https://github.com/WXinlong/DenseCL)

\n\n \n\n## 估计(Estimation)\n\n \n\n### 姿态估计(Human Pose Estimation)\n\n[23] Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video(超越视频中时间一致的 3D 人体姿势和形状的静态特征)
\n[paper](https://arxiv.org/abs/2011.08627) | [code](https://github.com/hongsukchoi/TCMR_RELEASE) | [video](https://youtu.be/WB3nTnSQDII)

\n\n[22] ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search(ViPNAS：通过神经架构搜索进行高效的视频姿态估计)
\n[paper](https://arxiv.org/abs/2105.10154)

\n\n[21] When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks(当人体姿势估计达到稳健性时：对抗算法和基准)
\n[paper](https://arxiv.org/abs/2105.06152)

\n\n[20] Monocular Real-time Full Body Capture with Inter-part Correlations(具有部分间相关性的单眼实时全身捕获)
\n[paper](https://arxiv.org/abs/2012.06087)

\n\n[19] Unsupervised Human Pose Estimation through Transforming Shape Templates(通过变换形状模板的无监督人体姿势估计)
\n[paper](https://arxiv.org/abs/2105.04154) | [project](https://infantmotion.github.io/)

\n\n[18] Body Meshes as Points(身体网格物体为点)
\n[paper](https://arxiv.org/abs/2105.02467) | [code](https://github.com/jfzhang95/BMP)

\n\n[17] PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation(PoseAug：用于3D人类姿势估计的可微分姿势增强框架)
\n[paper](https://arxiv.org/abs/2105.02465) | [code](https://github.com/jfzhang95/PoseAug)

\n\n[16] AGORA: Avatars in Geography Optimized for Regression Analysis(AGORA：针对回归分析进行了优化的地理头像)
\n[paper](https://arxiv.org/abs/2104.14643) | [project](https://agora.is.tue.mpg.de)

\n\n[15] Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration(用于3D人体网格配准的局部感知分段变换字段)
\n[paper](https://arxiv.org/abs/2104.08160) | [code](https://taconite.github.io/PTF/website/PTF.html)

\n\n[14] Pose Recognition with Cascade Transformers(级联Transformer的姿势识别)
\n[paper](https://arxiv.org/abs/2104.06976) | [code](https://github.com/mlpc-ucsd/PRTR)

\n\n[13] Lite-HRNet: A Lightweight High-Resolution Network(Lite-HRNet：轻巧的高分辨率网络)
\n[paper](https://arxiv.org/abs/2104.06403) | [code](https://github.com/HRNet/Lite-HRNet)

\n\n[12] Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo(具有平面扫描立体声的多视图多人3D姿势估计)
\n[paper](https://arxiv.org/abs/2104.02273) | [code](https://github.com/jiahaoLjh/PlaneSweepPose)

\n\n[11] Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression(通过解聚的关键点自下而上的人体姿势估计)
\n[paper](https://arxiv.org/abs/2104.02300) | [code](https://github.com/HRNet/DEKR)

\n\n[10] Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks(通过集成自上而下和自下而上的网络进行单眼3D多人姿势估计)
\n[paper](https://arxiv.org/abs/2104.01797) | [code](https://github.com/3dpose/3D-Multi-Person-Pose)

\n\n[9] Reconstructing 3D Human Pose by Watching Humans in the Mirror(通过照镜子中的人来重建3D人的姿势)
\n[paper](https://arxiv.org/abs/2104.00340) | [project](https://zju3dv.github.io/Mirrored-Human/)

\n\n[8] SimPoE: Simulated Character Control for 3D Human Pose Estimation(用于3D人体姿势估计的模拟角色控制)
\n[paper](https://arxiv.org/abs/2104.00683) | [project](https://www.ye-yuan.com/simpoe/)

\n\n[7] Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors(人体姿势系统（HPS）：人体安装传感器在大场景中的3D人体姿势估计和自定位)
\n[paper](https://arxiv.org/abs/2103.17265) | [project](http://virtualhumans.mpi-inf.mpg.de/hps/)

\n\n[6] Graph Stacked Hourglass Networks for 3D Human Pose Estimation(用于3D人体姿势估计的图形堆叠沙漏网络)
\n[paper](https://arxiv.org/abs/2103.16385)

\n\n[5] From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation(【动物姿态估计】从合成到真实：用于动物姿势估计的无监督域自适应)
\n[paper](https://arxiv.org/abs/2103.14843) | [code](https://github.com/chaneyddtt/UDA-Animal-Pose)

\n\n[4] DCPose: Deep Dual Consecutive Network for Human Pose Estimation(用于人体姿态估计的深度双重连续网络)
\n[paper](https://arxiv.org/abs/2103.07254) | [code](https://github.com/Pose-Group/DCPose)

\n\n[3] Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing(用于实例感知人类语义解析的可微分多粒度人类表示学习)
\n[paper](https://arxiv.org/pdf/2103.04570.pdf) | [code](https://github.com/tfzhou/MG-HumanParsing)

\n\n[2] CanonPose: Self-supervised Monocular 3D Human Pose Estimation in the Wild（野外自监督的单眼3D人类姿态估计）

\n\n[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers（具有透视裁剪层的3D姿势的几何感知神经重建）
\n[paper](https://arxiv.org/abs/2011.13607)

\n\n \n\n\n### 手势估计(Gesture Estimation)\n\n[6] Improving Sign Language Translation with Monolingual Data by Sign Back-Translation(【手势翻译】基于单语数据的手语翻译研究)
\n[paper](https://arxiv.org/abs/2105.12397)

\n\n[5] ContactOpt: Optimizing Contact to Improve Grasps(ContactOpt：优化接触以改进抓取)
\n[paper](https://arxiv.org/abs/2104.07267)

\n\n[4] Fingerspelling Detection in American Sign Language(美国手语中的手指拼写检测)
\n[paper](https://arxiv.org/abs/2104.01291)

\n\n[3] Read and Attend: Temporal Localisation in Sign Language Videos(阅读和参加：手语视频中的时间本地化)
\n[paper](https://arxiv.org/abs/2103.16481) | [project](https://www.robots.ox.ac.uk/~vgg/research/bslattend/)

\n\n[2] Skeleton Based Sign Language Recognition Using Whole-body Keypoints(基于全身关键点的基于骨架的手语识别)
\n[paper](https://arxiv.org/abs/2103.08833) | [code](https://github.com/jackyjsy/CVPR21Chal-SLR)

\n\n[1] Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration(基于语义聚合和自适应2D-1D配准的相机空间手部网格恢复)
\n[paper](https://arxiv.org/pdf/2103.02845.pdf) | [code](https://github.com/SeanChenxy/HandMesh)

\n\n \n\n### 光流/位姿/运动估计(Optical Flow/Pose/Motion Estimation)\n\n[17] Exploring intermediate representation for monocular vehicle pose estimation(探索单目车辆姿态估计的中间表示)
\n[paper](https://arxiv.org/abs/2011.08464) | [code](https://github.com/Nicholasli1995/EgoNet)

\n\n[16] Extreme Rotation Estimation using Dense Correlation Volumes(使用密集相关体积的极端旋转估计)
\n[paper](https://arxiv.org/abs/2104.13530) | [project](https://ruojincai.github.io/ExtremeRotation/)

\n\n[15] Motion Representations for Articulated Animation(【运动估计&表示】关节动画的运动表示)
\n[paper](https://arxiv.org/abs/2104.11280) | [code](https://github.com/snap-research/articulated-animation)

\n\n[14] Self-Supervised Pillar Motion Learning for Autonomous Driving(【运动估计】用于自动驾驶的自我监督支柱运动学习)
\n[paper](https://arxiv.org/abs/2104.08683)

\n\n[13] Single-view robot pose and joint angle estimation via render & compare(通过渲染和比较进行单视图机器人姿态和关节角度估计)
\n[paper](https://arxiv.org/abs/2104.09359) | [code](https://www.di.ens.fr/willow/research/robopose/)

\n\n[12] Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty(新旧融合：通过几何引导的不确定性学习相对相机姿势)
\n[paper](https://arxiv.org/abs/2104.08278)

\n\n[11] VOLDOR: Visual Odometry from Log-logistic Dense Optical flow Residuals(【视觉测距】VOLDOR：来自对数逻辑密集光流残差的视觉里程表)
\n[paper](https://arxiv.org/abs/2104.06789)

\n\n[10] DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-scale Consistency(【6D位姿估计】通过双尺度一致性学习6DoF对象姿势估计)
\n[paper](https://arxiv.org/abs/2104.03658)

\n\n[9] Learning optical flow from still images(【光流估计】从静止图像中学习光流)
\n[paper](https://arxiv.org/abs/2104.03965) | [project](https://mattpoggi.github.io/projects/cvpr2021aleotti/)

\n\n[8] Learning Optical Flow from a Few Matches(【光流估计】通过少量匹配学习光流)
\n[paper](https://arxiv.org/abs/2104.02166) | [code](https://github.com/zacjiang/scv)

\n\n[7] FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds(【光流估计】FESTA：场景点云通过时空注意进行光流估计)
\n[paper](https://arxiv.org/abs/2104.00798)

\n\n[6] Wide-Depth-Range 6D Object Pose Estimation in Space(【6D位姿估计】空间中的深度范围6D对象姿态估计)
\n[paper](https://arxiv.org/abs/2104.00337)

\n\n[5] Deep Two-View Structure-from-Motion Revisited(重新审视运动的深层两视图结构)
\n[paper](https://arxiv.org/abs/2104.00556)

\n\n[4] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism(【6D位姿估计】具有分离旋转机制的类别级6D对象姿势估计的快速基于形状的网络)
\n[paper](https://arxiv.org/abs/2103.07054) | [code](https://github.com/DC1991/FS_Net)

\n\n[3] GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation(【6D位姿估计】用于单眼6D对象姿态估计的几何引导直接回归网络)
\n[paper](https://arxiv.org/abs/2102.12145) | [code](https://github.com/THU-DA-6D-Pose-Group/GDR-Net)

\n\n[2] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments(在动态室内环境中，通过空间划分的鲁棒神经路由可实现摄像机的重新定位)
\n[paper](https://arxiv.org/abs/2012.04746) | [project](https://ai.stanford.edu/~hewang/)

\n\n[1] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization(通过3D扫描同步进行多主体分割和运动估计)
\n[paper](https://arxiv.org/pdf/2101.06605.pdf) | [code](https://github.com/huangjh-pub/multibody-sync)

\n\n \n\n### 深度估计(Depth Estimation)\n\n[18] MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera(MonoRec：在动态环境中从单个移动相机进行半监督密集重建)
\n[paper](https://arxiv.org/abs/2011.11814) | [project](https://vision.in.tum.de/research/monorec)

\n\n[17] Single Image Depth Estimation using Wavelet Decomposition(使用小波分解的单幅图像深度估计)
\n[paper](https://arxiv.org/abs/2106.02022) | [code](https://github.com/nianticlabs/wavelet-monodepth)

\n\n[16] Self-Supervised Multi-Frame Monocular Scene Flow(自监督多帧单眼场景流)
\n[paper](https://arxiv.org/abs/2105.02216) | [code](https://github.com/visinf/multi-mono-sf)

\n\n[15] Binary TTC: A Temporal Geofence for Autonomous Navigation(【接触时间估计】二进制TTC：自主导航的时空地理围栏)
\n[paper](https://arxiv.org/abs/2101.04777)

\n\n[14] The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth(时间机会主义者：自我监督的多帧单眼深度)
\n[paper](https://arxiv.org/abs/2104.14540)

\n\n[13] Lighting, Reflectance and Geometry Estimation from 360∘ Panoramic Stereo(360∘全景立体的光照、反射率和几何估计)
\n[paper](https://arxiv.org/abs/2104.09886) | [code](https://github.com/junxuan-li/LRG_360Panoramic)

\n\n[12] Depth Completion using Plane-Residual Representation(使用平面残差表示法的深度补全)
\n[paper](https://arxiv.org/abs/2104.07350)

\n\n[11] StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision(通过立体视觉进行深度感知的布景人类数字化)
\n[paper](https://arxiv.org/abs/2104.05289) | [project](http://crishy1995.github.io/StereoPIFuProject)

\n\n[10] Self-supervised Learning of Depth Inference for Multi-view Stereo(多视图立体声深度推理的自我监督学习)
\n[paper](https://arxiv.org/abs/2104.02972) | [code](https://github.com/JiayuYANG/Self-supervised-CVP-MVSNet)

\n\n[9] Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries(遮挡边界处的深度补全和双曲面外推)
\n[paper](https://arxiv.org/abs/2104.02253)

\n\n[8] S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation(学习通用的深度特定的结构表示)
\n[paper](https://arxiv.org/abs/2104.00877) | [code](https://github.com/microsoft/S2R-DepthNet)

\n\n[7] RGB-D Local Implicit Function for Depth Completion of Transparent Objects(RGB-D局部隐式函数用于透明对象的深度补全)
\n[paper](https://arxiv.org/abs/2104.00622) | [code](https://research.nvidia.com/publication/2021-03_RGB-D-Local-Implicit)

\n\n[6] LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering(通过可分辨深度渲染进行单眼360布局估算)
\n[paper](https://arxiv.org/abs/2104.00568) | [project](https://fuenwang.ml/project/led2net)

\n\n[5] Deep Two-View Structure-from-Motion Revisited(重新审视运动的深层两视图结构)
\n[paper](https://arxiv.org/abs/2104.00556)

\n\n[4] Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging(学习微透镜掩模以在飞行时间成像中进行飞行像素校正)
\n[paper](https://arxiv.org/abs/2103.16693) | [project](https://light.princeton.edu/publication/mask-tof)

\n\n[3] Generalizing to the Open World: Deep Visual Odometry with Online Adaptation(推广到开放世界：具有在线适应功能的深度视觉里程表)
\n[paper](https://arxiv.org/abs/2103.15279)

\n\n[2] Beyond Image to Depth: Improving Depth Prediction using Echoes(超越图像深度：使用回声改善深度预测)
\n[paper](https://arxiv.org/pdf/2103.08468.pdf) | [code](https://krantiparida.github.io/projects/bimgdepth.html)

\n\n\n[1] PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss(具有神经位置编码和蒸馏消光损耗的自我监督单视图深度估计的像素级精度)
\n[paper](https://arxiv.org/abs/2103.07362)

\n\n
\n\n \n\n\n## 图像处理(Image Processing)\n\n[1] Invertible Image Signal Processing(可逆图像信号处理)
\n[paper](https://arxiv.org/abs/2103.15061) | [code](https://github.com/yzxing87/Invertible-ISP)

\n\n \n\n### 超分辨率(Super Resolution)\n\n[10] KOALAnet: Blind Super-Resolution using Kernel-Oriented Adaptive Local Adjustment(KOALAnet：使用面向内核的自适应局部调整的盲超分辨率)
\n[paper](https://arxiv.org/abs/2012.08103)

\n\n[9] BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
\n[paper](https://arxiv.org/abs/2012.02181)
\n\n[8] Temporal Modulation Network for Controllable Space-Time Video Super-Resolution(可控时空视频超分辨率的时间调制网络)
\n[paper](https://arxiv.org/abs/2104.10642) | [code](https://github.com/CS-GangXu/TMNet)

\n\n[7] SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation(SRWarp：任意变换下的广义图像超分辨率)
\n[paper](https://arxiv.org/abs/2104.10325)

\n\n[6] Unsupervised Degradation Representation Learning for Blind Super-Resolution(盲超分辨率的无监督退化表示学习)
\n[paper](https://arxiv.org/abs/2104.00416) | [code](https://github.com/LongguangWang/DASR)

\n\n[5] Flow-based Kernel Prior with Application to Blind Super-Resolution(基于流的内核先于盲超分辨率的应用)
\n[paper](https://arxiv.org/abs/2103.15977) | [code](https://github.com/JingyunLiang/FKP)

\n\n[4] ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic(通过数据特征加速超分辨率网络的通用框架)
\n[paper](https://arxiv.org/abs/2103.04039) | [explainer: ClassSR accelerates image super-resolution, cutting computation by 50% with no loss in SR performance](https://zhuanlan.zhihu.com/p/355873199)

\n\n[3] Learning Continuous Image Representation with Local Implicit Image Function(通过局部隐含图像功能学习连续图像表示)
\n[paper](https://arxiv.org/abs/2012.09161) | [code](https://github.com/yinboc/liif) | [video](https://youtu.be/6f2roieSY_8) | [project](https://yinboc.github.io/liif/)

\n\n[2] Data-Free Knowledge Distillation For Image Super-Resolution(an SR version of the DAFL algorithm)

\n\n[1] AdderSR: Towards Energy Efficient Image Super-Resolution(applying adder networks to image super-resolution)
\n[paper](https://arxiv.org/pdf/2009.08891.pdf) | [code](https://github.com/huawei-noah/AdderNet)
\nExplainer: [Huawei open-sources its adder neural network](https://zhuanlan.zhihu.com/p/113536045)

\n\n \n\n### 图像复原/图像增强(Image Restoration)\n\n[3] Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network(利用动态跳跃连接网络消除显示下摄像机衍射图像伪影)
\n[paper](https://arxiv.org/abs/2104.09556)

\n\n[2] NeX: Real-time View Synthesis with Neural Basis Expansion(NeX：具有神经基础扩展的实时视图合成)
\n[paper](https://arxiv.org/abs/2103.05606) | [code](https://nex-mpi.github.io/)

\n\n[1] Multi-Stage Progressive Image Restoration(多阶段渐进式图像复原)
\n[paper](https://arxiv.org/abs/2102.02808) | [code](https://github.com/swz30/MPRNet)

\n\n\n \n\n### 图像去阴影/去反射(Image Shadow Removal/Image Reflection Removal)\n\n[3] From Shadow Generation to Shadow Removal(从阴影生成到阴影去除)
\n[paper](https://arxiv.org/abs/2103.12997)

\n\n[2] Robust Reflection Removal with Reflection-free Flash-only Cues(通过无反射的仅含Flash线索进行鲁棒的反射去除)
\n[paper](https://arxiv.org/pdf/2103.04273.pdf) | [code](https://github.com/ChenyangLEI/flash-reflection-removal)

\n\n[1] Auto-Exposure Fusion for Single-Image Shadow Removal(用于单幅图像阴影去除的自动曝光融合)
\n[paper](https://arxiv.org/abs/2103.01255) | [code](https://github.com/tsingqguo/exposure-fusion-shadow-removal)

\n\n\n\n \n\n### 图像去噪/去模糊/去雨去雾(Image Denoising)\n\n[8] FBI-Denoiser: Fast Blind Image Denoiser for Poisson-Gaussian Noise(FBI-Denoiser：泊松-高斯噪声的快速盲图像降噪器)
\n[paper](https://arxiv.org/abs/2105.10967) | [code](https://github.com/csm9493/FBI-Denoiser)

\n\n[7] Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments(弱光环境下用于摄影的闪光灯和非闪光灯对的深度降噪)
\n[paper](https://arxiv.org/abs/2012.05116) | [project](https://www.cse.wustl.edu/~zhihao.xia/deepfnf/)

\n\n[6] Digital Gimbal: End-to-end Deep Image Stabilization with Learnable Exposure Times(数码云台：具有可学习的曝光时间的端到端深度图像稳定)
\n[paper](https://arxiv.org/abs/2012.04515)

\n\n[5] Contrastive Learning for Compact Single Image Dehazing(紧凑型单图像去雾的对比学习)
\n[paper](https://arxiv.org/abs/2104.09367) | [code](https://github.com/GlassyWu/AECR-Net)

\n\n[4] Explore Image Deblurring via Blur Kernel Space(通过模糊内核空间探索图像去模糊)
\n[paper](https://arxiv.org/abs/2104.00317)

\n\n[3] Semi-Supervised Video Deraining with Dynamic Rain Generator(带动态雨水产生器的半监督视频去雨)
\n[paper](https://arxiv.org/abs/2103.07939)

\n\n[2] ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring(学习用于视频去模糊的全范围体积对应)
\n[paper](https://arxiv.org/pdf/2103.04260.pdf)

\n\n[1] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects(快速移动物体的去模糊和形状恢复)
\n[paper](https://arxiv.org/abs/2012.00595) | [code](https://github.com/rozumden/DeFMO) | [video](https://www.youtube.com/watch?v=pmAynZvaaQ4)

\n\n \n\n### 图像编辑/图像修复(Image Edit/Inpainting)\n\n[11] PD-GAN: Probabilistic Diverse GAN for Image Inpainting(PD-GAN：用于图像修复的概率多样GAN)
\n[paper](https://arxiv.org/abs/2105.02201)

\n\n[10] StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing(StyleMapGAN：利用GAN中潜在的空间维度进行实时图像编辑)
\n[paper](https://arxiv.org/abs/2104.14754) | [code](https://github.com/naver-ai/StyleMapGAN)

\n\n[9] Image Inpainting with External-internal Learning and Monochromic Bottleneck(具有外部-内部学习和单色瓶颈的图像修复)
\n[paper](https://arxiv.org/abs/2104.09068)

\n\n[8] TransFill: Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations(通过合并多个颜色和空间变换进行参考引导的图像修复)\n[paper](https://arxiv.org/abs/2103.15982)

\n\n[7] DeFLOCNet: Deep Image Editing via Flexible Low-level Controls(通过灵活的低级控件进行深度图像编辑)
\n[paper](https://arxiv.org/abs/2103.12723)

\n\n[6] Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE(使用分层VQ-VAE生成图像修复的多样结构)
\n[paper](https://arxiv.org/pdf/2103.10022) | [code](https://github.com/USTC-JialunPeng/Diverse-Structure-Inpainting)

\n\n[5] PISE: Person Image Synthesis and Editing with Decoupled GAN(使用分离的GAN进行人像合成和编辑)
\n[paper](https://arxiv.org/abs/2103.04023) | [code](https://github.com/Zhangjinso/PISE)

\n\n[4] DeFLOCNet: Deep Image Editing via Flexible Low-level Controls(通过灵活的低级控件进行深度图像编辑)
\n[paper](https://arxiv.org/abs/2103.12723)

\n\n[3] PD-GAN: Probabilistic Diverse GAN for Image Inpainting(用于图像修复的概率多样GAN)
\n[paper](https://arxiv.org/abs/2105.02201)

\n\n[2] Anycost GANs for Interactive Image Synthesis and Editing(用于交互式图像合成和编辑的AnyCost Gans)
\n[paper](https://arxiv.org/abs/2103.03243) | [code](https://github.com/mit-han-lab/anycost-gan)

\n\n[1] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing(利用GAN中潜在的空间维度进行实时图像编辑)
\n[paper](https://arxiv.org/abs/2104.14754) | [code](https://github.com/naver-ai/StyleMapGAN)

\n\n \n\n### 图像翻译(Image Translation)\n\n[9] CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation(CoCosNet v2：用于图像翻译的全分辨率对应学习)
\n[paper](https://arxiv.org/abs/2012.02047)

\n\n[8] Visualizing Adapted Knowledge in Domain Transfer(领域转移中适应性知识的可视化)
\n[paper](https://arxiv.org/abs/2104.10602) | [code](https://github.com/hou-yz/DA_visualization)

\n\n[7] Memory-guided Unsupervised Image-to-image Translation(记忆引导的无监督图像到图像翻译)
\n[paper](https://arxiv.org/abs/2104.05170)

\n\n[6] ReMix: Towards Image-to-Image Translation with Limited Data(使用有限的数据实现图像到图像的翻译)
\n[paper](https://arxiv.org/abs/2103.16835)

\n\n[5] Closing the Loop: Joint Rain Generation and Removal via Disentangled Image Translation(闭环：通过解耦图像翻译联合生成和去除雨水)
\n[paper](https://arxiv.org/abs/2103.13660)

\n \n[4] CoMoGAN: continuous model-guided image-to-image translation(连续的模型指导的图像到图像翻译)
\n[paper](https://arxiv.org/abs/2103.06879) | [code](http://github.com/cv-rits/CoMoGAN)

\n\n[3] Spatially-Adaptive Pixelwise Networks for Fast Image Translation(空间自适应像素网络，用于快速图像翻译)
\n[paper](https://arxiv.org/abs/2012.02992) | [project](https://tamarott.github.io/ASAPNet_web/)

\n\n[2] Image-to-image Translation via Hierarchical Style Disentanglement
\n[paper](https://arxiv.org/abs/2103.01456) | [code](https://github.com/imlixinyang/HiSD) | [解读-层次风格解耦：人脸多属性篡改终于可控了](https://zhuanlan.zhihu.com/p/354258056)

\n\n[1] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation(样式编码：用于图像到图像翻译的StyleGAN编码器)
\n[paper](https://arxiv.org/abs/2008.00951) | [code](https://github.com/eladrich/pixel2style2pixel) | [project](https://eladrich.github.io/pixel2style2pixel/)

\n\n \n\n### 图像质量评估(Image Quality Assessment)\n\n[1] SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance(具有相似分布距离的无监督人脸图像质量评估)
\n[paper](https://arxiv.org/abs/2103.05977)

\n\n \n\n### 风格迁移(Style Transfer)\n\n[6] Style-Aware Normalized Loss for Improving Arbitrary Style Transfer(一种改进任意风格转换的风格感知归一化损失算法)
\n[paper](https://arxiv.org/abs/2104.10064)

\n\n[5] Instagram Filter Removal on Fashionable Images(删除时尚图片上的Instagram滤镜)
\n[paper](https://arxiv.org/abs/2104.05072)

\n\n[4] Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer(起草和修订：拉普拉斯金字塔网络，用于快速高质量的艺术风格转移)
\n[paper](https://arxiv.org/abs/2104.05376) | [code](https://github.com/PaddlePaddle/PaddleGAN/)

\n\n[3] Rethinking and Improving the Robustness of Image Style Transfer(重新思考和改善图像风格迁移的鲁棒性)
\n[paper](https://arxiv.org/abs/2104.05623)

\n\n[2] ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows(通过可逆神经流进行无偏的图像风格迁移)
\n[paper](https://arxiv.org/abs/2103.16877)

\n\n[1] Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes(重新考虑风格迁移：从像素到参数化笔触)
\n[paper](https://arxiv.org/abs/2103.17185)

\n\n
\n\n \n\n## 人脸(Face)\n\n[6] Continuous Face Aging via Self-estimated Residual Age Embedding(通过自我估计的残差年龄嵌入来实现连续的面部老化)
\n[paper](https://arxiv.org/abs/2105.00020)

\n\n[5] Towards High Fidelity Face Relighting with Realistic Shadows(实现具有逼真阴影的高保真人脸重光照)
\n[paper](https://arxiv.org/abs/2104.00825)

\n\n[4] Unsupervised Disentanglement of Linear-Encoded Facial Semantics(线性编码的面部语义的无监督解缠)
\n[paper](https://arxiv.org/abs/2103.16605)

\n\n[3] High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation(通过深度照明自适应实现AR / VR的高保真人脸跟踪)
\n[paper](https://arxiv.org/abs/2103.15876) | [project](https://www.cs.rochester.edu/u/lchen63)

\n\n[2] Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes(具有10^7个节点的大规模图上的结构感知人脸聚类)
\n[paper](https://arxiv.org/abs/2103.13225) | [code&project](https://sstzal.github.io/STAR-FC/)

\n\n[1] SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance(具有相似分布距离的无监督人脸图像质量评估)
\n[paper](https://arxiv.org/abs/2103.05977)

\n\n \n\n### 人脸识别/检测(Facial Recognition/Detection)\n\n[12] Dynamic Class Queue for Large Scale Face Recognition In the Wild(野外大规模人脸识别的动态类队列)
\n[paper](https://arxiv.org/abs/2105.11113) | [code](https://github.com/bilylee/DCQ)

\n\n[11] Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition(特征分解与重构学习对有效的面部表情识别)
\n[paper](https://arxiv.org/abs/2104.05160)

\n\n[10] FACESEC: A Fine-grained Robustness Evaluation Framework for Face Recognition Systems(FACESEC：用于人脸识别系统的细粒度鲁棒性评估框架)
\n[paper](https://arxiv.org/abs/2104.04107)

\n\n[9] IronMask: Modular Architecture for Protecting Deep Face Template(用于保护深脸模板的模块化体系结构)
\n[paper](https://arxiv.org/abs/2104.02239)

\n\n[8] HLA-Face: Joint High-Low Adaptation for Low Light Face Detection(用于低光人脸检测的联合高低适应)
\n[paper](https://arxiv.org/abs/2104.01984) | [project](https://daooshee.github.io/HLA-Face-Website/)

\n\n[7] Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition(潜入歧义：面部表情识别的潜在分布挖掘和成对不确定性估计)
\n[paper](https://arxiv.org/abs/2104.00232)

\n\n[6] Affective Processes: stochastic modelling of temporal context for emotion and facial expression recognition(情感过程：情感和面部表情识别的时态随机模型)
\n[paper](https://arxiv.org/abs/2103.13372)

\n\n[5] Cross-Domain Similarity Learning for Face Recognition in Unseen Domains(跨域相似性学习在未知领域中的人脸识别)
\n[paper](https://arxiv.org/abs/2103.07503)

\n\n[4] MagFace: A Universal Representation for Face Recognition and Quality Assessment(MagFace：人脸识别和质量评估的通用表示形式)
\n[paper](https://arxiv.org/abs/2103.06627) | [code](https://github.com/IrvingMeng/MagFace)

\n\n[3] CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement(用于模型不可知的面部检测细化的置信度排名)
\n[paper](https://arxiv.org/abs/2103.07017)

\n\n[2] A 3D GAN for Improved Large-pose Facial Recognition(用于改善大姿势面部识别的3D GAN)
\n[paper](https://arxiv.org/pdf/2012.10545.pdf)

\n\n[1] WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition(揭示了百万级深度人脸识别力量的基准测试)
\n[paper](https://arxiv.org/abs/2103.04098) | [benchmark](https://www.face-benchmark.org/)

\n\n \n\n### 人脸生成/合成/重建/编辑(Face Generation/Face Synthesis/Face Reconstruction/Face Editing)\n\n[14] Lifting 2D StyleGAN for 3D-Aware Face Generation(提升 2D StyleGAN 以生成 3D 感知人脸)
\n[paper](https://arxiv.org/abs/2011.13126)

\n\n[13] Monocular Real-time Full Body Capture with Inter-part Correlations(具有部分间相关性的单眼实时全身捕获)
\n[paper](https://arxiv.org/abs/2012.06087)

\n\n[12] Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation(基于隐式模块化视听表示的姿态可控人脸生成)
\n[paper](https://arxiv.org/abs/2104.11116) | [code](https://hangz-nju-cuhk.github.io/projects/PC-AVS)

\n\n[11] Audio-Driven Emotional Video Portraits(音频情感视频肖像)
\n[paper](https://arxiv.org/abs/2104.07452)

\n\n[10] Pixel Codec Avatars(像素编解码器头像)
\n[paper](https://arxiv.org/abs/2104.04638)

\n\n[9] Riggable 3D Face Reconstruction via In-Network Optimization(通过网络内优化进行可操纵的3D人脸重建)
\n[paper](https://arxiv.org/abs/2104.03493) | [code](https://github.com/zqbai-jeremy/INORig)

\n\n[8] Everything's Talkin': Pareidolia Face Reenactment(一切都在说话：幻想性错觉人脸重演)
\n[paper](https://arxiv.org/abs/2104.03061) | [project](https://wywu.github.io/projects/ETT/ETT.html)\n\n[7] High-Fidelity and Arbitrary Face Editing(高保真和任意脸部编辑)
\n[paper](https://arxiv.org/abs/2103.15814)

\n\n[6] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction(单视图3D漫画面部重建的数据集和基线方法)
\n[paper](https://arxiv.org/pdf/2103.08204.pdf) | [project](https://qiuyuda.github.io/3DCaricShop/)

\n\n[5] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis(进行全面伪造分析的多功能基准)
\n[paper](https://arxiv.org/abs/2103.05630) | [code](https://yinanhe.github.io/projects/forgerynet.html)

\n\n[4] Image-to-image Translation via Hierarchical Style Disentanglement(通过分层样式分解实现图像到图像的翻译)
\n[paper](https://arxiv.org/abs/2103.01456) | [code](https://github.com/imlixinyang/HiSD)

\n\n[3] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework(当年龄不变的人脸识别遇到人脸年龄合成时：一个多任务学习框架)
\n[paper](https://arxiv.org/abs/2103.01520) | [code](https://github.com/Hzzone/MTLFace)

\n\n[2] PISE: Person Image Synthesis and Editing with Decoupled GAN(使用分离的GAN进行人像合成和编辑)
\n[paper](https://arxiv.org/abs/2103.04023) | [code](https://github.com/Zhangjinso/PISE)

\n\n[1] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders(分析和改进自省变分自动编码器)
\n[paper](https://arxiv.org/pdf/2012.13253.pdf) | [code](https://github.com/taldatech/soft-intro-vae-pytorch) | [project](https://taldatech.github.io/soft-intro-vae-web/)

\n\n \n\n### 人脸伪造/反欺骗(Face Forgery/Face Anti-Spoofing)\n\n[6] Improving the Efficiency and Robustness of Deepfakes Detection through Precise Geometric Features(通过精确的几何特征提高假脸检测的效率和鲁棒性)
\n[paper](https://arxiv.org/abs/2104.04480)

\n\n[5] Face Forensics in the Wild(人脸伪造数据集)
\n[paper](https://arxiv.org/abs/2103.16076) | [code](https://github.com/tfzhou/FFIW)

\n\n[4] Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection(【人脸伪造检测】由单中心损失监督的频率感知判别特征学习，用于人脸伪造检测)
\n[paper](https://arxiv.org/abs/2103.09096)

\n\n[3] MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes(掩码引导的检测与重建，用于防御深度伪造)
\n[paper](https://arxiv.org/abs/2103.14211)

\n\n[2] Cross Modal Focal Loss for RGBD Face Anti-Spoofing(跨模态焦点损失，用于RGBD人脸反欺骗)\n[paper](https://arxiv.org/abs/2103.00948)

\n\n[1] Multi-attentional Deepfake Detection(多注意的Deepfake检测)
\n[paper](https://arxiv.org/abs/2103.02406)

\n\n\n\n
\n\n \n\n## 目标跟踪(Object Tracking)\n\n[18] Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation(Alpha-Refine：通过精确的边界框估计来提高跟踪性能)
\n[paper](https://arxiv.org/abs/2012.06815) | [code](https://github.com/MasterBin-IIAU/AlphaRefine)

\n\n[17] LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search(LightTrack：通过单次架构搜索找到用于目标跟踪的轻量级神经网络)
\n[paper](https://arxiv.org/abs/2104.14545) | [code](https://github.com/researchmm/LightTrack)

\n\n[16] Multiple Object Tracking with Correlation Learning(相关学习的多目标跟踪)
\n[paper](https://arxiv.org/abs/2104.03541)

\n\n[15] Learning to Track Instances without Video Annotations(学习在没有视频注释的情况下跟踪实例)
\n[paper](https://arxiv.org/abs/2104.00287)

\n\n[14] STMTrack: Template-free Visual Tracking with Space-time Memory Networks(具有时空存储网络的无模板视觉跟踪)
\n[paper](https://arxiv.org/abs/2104.00324) | [code](https://github.com/fzh0917/STMTrack)

\n\n[13] Online Multiple Object Tracking with Cross-Task Synergy(具有跨任务协同作用的在线多对象跟踪)
\n[paper](https://arxiv.org/abs/2104.00380) | [code](https://github.com/songguocode/TADAM)

\n\n[12] Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark(使用自然语言实现更灵活，准确的对象跟踪：算法和基准)
\n[paper](https://arxiv.org/abs/2103.16746)

\n\n[11] Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking(可学习的图匹配：将图分区与深度特征学习相结合以实现多对象跟踪)
\n[paper](https://arxiv.org/abs/2103.16178) | [code](https://github.com/jiaweihe1996/GMTracker)

\n\n[10] IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking(IoU攻击：面向视觉目标跟踪的时序一致黑盒对抗攻击)
\n[paper](https://arxiv.org/abs/2103.14938) | [code](https://github.com/VISION-SJTU/IoUattack)

\n\n[9] Transformer Tracking(Transformer跟踪)
\n[paper](https://arxiv.org/abs/2103.15436) | [code](https://github.com/chenxin-dlut/TransT)

\n\n[8] Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking(Transformer与追踪器相遇：利用时间上下文进行可靠的视觉追踪)
\n[paper](https://arxiv.org/pdf/2103.11681)

\n\n[7] Track to Detect and Segment: An Online Multi-Object Tracker(跟踪检测和分段：在线多目标跟踪器)
\n[paper](https://arxiv.org/abs/2103.08808) | [code](https://jialianwu.com/projects/TraDeS.html)

\n\n[6] Learning a Proposal Classifier for Multiple Object Tracking(学习用于多目标跟踪的候选分类器)
\n[paper](https://arxiv.org/abs/2103.07889) | [code](https://github.com/daip13/LPC_MOT.git)

\n\n[5] Center-based 3D Object Detection and Tracking(基于中心的3D目标检测和跟踪)
\n[paper](https://arxiv.org/abs/2006.11275) | [code](https://github.com/tianweiy/CenterPoint)

\n\n[4] HPS: localizing and tracking people in large 3D scenes from wearable sensors(通过可穿戴式传感器对大型3D场景中的人进行定位和跟踪)

\n\n[3] Track to Detect and Segment: An Online Multi-Object Tracker(跟踪检测和分段：在线多对象跟踪器)
\n[project](https://jialianwu.com/projects/TraDeS.html) | [video](https://www.youtube.com/watch?v=oGNtSFHRZJA)

\n\n[2] Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking(多目标跟踪的概率轨迹片段评分和修复)
\n[paper](https://arxiv.org/abs/2012.02337)

\n\n[1] Rotation Equivariant Siamese Networks for Tracking(用于跟踪的旋转等变孪生网络)
\n[paper](https://arxiv.org/abs/2012.13078)

\n\n\n

\n \n\n## 图像&视频检索/理解(Image&Video Retrieval/Video Understanding)\n\n\n[11] VIGOR: Cross-View Image Geo-localization beyond One-to-one Retrieval(VIGOR：超越一对一检索的交叉视图图像地理定位)
\n[paper](https://arxiv.org/abs/2011.12172) | [dataset&project](https://github.com/Jeff-Zilence/VIGOR)

\n\n[10] Compatibility-aware Heterogeneous Visual Search(兼容感知的异构视觉搜索)
\n[paper](https://arxiv.org/abs/2105.06047)

\n\n[9] 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition(2D还是非2D？自适应3D卷积选择以实现高效的视频识别)
\n[paper](https://arxiv.org/abs/2012.14950)

\n\n[8] FrameExit: Conditional Early Exiting for Efficient Video Recognition(【视频理解】帧退出：有条件提前退出以实现有效的视频识别)
\n[paper](https://arxiv.org/abs/2104.13400)

\n\n[7] T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval(T2VLAD：用于文本视频检索的全局局部序列比对)
\n[paper](https://arxiv.org/abs/2104.10054)

\n\n[6] Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers(快速思考和缓慢思考：使用Transformer进行高效的文本到视觉检索)
\n[paper](https://arxiv.org/abs/2103.16553)

\n\n[5] StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval(StyleMeUp：迈向与风格无关的基于草图的图像检索)
\n[paper](https://arxiv.org/abs/2103.15706)

\n\n[4] More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval(您只需要更多照片：基于半监督学习的细粒度基于草图的图像检索)
\n[paper](https://arxiv.org/abs/2103.13990) | [code](https://github.com/AyanKumarBhunia/semisupervised-FGSBIR)

\n\n[3] Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning(使用分层Transformer和自我监督学习改进跨模态食谱检索)
\n[paper](https://arxiv.org/pdf/2103.13061.pdf)

\n\n[2] On Semantic Similarity in Video Retrieval(视频检索中的语义相似度)
\n[paper](https://arxiv.org/abs/2103.10095) | [code](https://mwray.github.io/SSVR/)

\n\n[1] QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval(实用的查询高效的图像检索黑盒攻击)
\n[paper](https://arxiv.org/abs/2103.02927)

\n\n\n\n \n\n### 行为识别/动作识别/检测/分割/定位(Action/Activity Recognition)\n\n[25] Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization(通过对比交叉视图互信息最大化学习视图解开人体姿势表示)
\n[paper](https://arxiv.org/abs/2012.01405) | [code](https://github.com/google-research/google-research/tree/master/poem)

\n\n[24] Anticipating human actions by correlating past with the future with Jaccard similarity measures(用Jaccard相似性测度预测人类行为)
\n[paper](https://arxiv.org/abs/2105.12414)

\n\n[23] Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation(语言查询视频演员分割的协作时空建模)
\n[paper](https://arxiv.org/abs/2105.06818)
\n\n[22] Home Action Genome: Cooperative Compositional Action Understanding(家庭行动基因组：合作组成行动的理解)
\n[paper](https://arxiv.org/abs/2105.05226)

\n\n[21] Weakly Supervised Action Selection Learning in Video(视频中的弱监督动作选择学习)
\n[paper](https://arxiv.org/abs/2105.02439) | [code](https://github.com/layer6ai-labs/ASL)

\n\n[20] Global2Local: Efficient Structure Search for Video Action Segmentation
\n[paper](https://arxiv.org/abs/2101.00910) | [code](https://github.com/ShangHua-Gao/G2L-search)

\n\n[19] Self-Supervised Learning for Semi-Supervised Temporal Action Proposal(自我监督学习的半监督时间行动建议)
\n[paper](https://arxiv.org/abs/2104.03214)

\n\n[18] Anchor-Constrained Viterbi for Set-Supervised Action Segmentation(锚约束维特比用于集合监督的动作分割)
\n[paper](https://arxiv.org/abs/2104.02113)

\n\n[17] Action Shuffle Alternating Learning for Unsupervised Action Segmentation(动作洗牌交替学习，实现无监督动作分割)
\n[paper](https://arxiv.org/abs/2104.02116)

\n\n[16] Self-supervised Motion Learning from Static Images(从静态图像进行自我监督的运动学习)
\n[paper](https://arxiv.org/abs/2104.00240)

\n\n[15] CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning(带有片段对比学习的弱监督时序动作定位)\n[paper](https://arxiv.org/abs/2103.16392)

\n\n[14] Recognizing Actions in Videos from Unseen Viewpoints(从看不见的角度识别视频中的动作)
\n[paper](https://arxiv.org/abs/2103.16516)

\n\n[13] No frame left behind: Full Video Action Recognition(不遗漏任何一帧：完整视频动作识别)
\n[paper](https://arxiv.org/abs/2103.15395)

\n\n[12] Learning Salient Boundary Feature for Anchor-free Temporal Action Localization(学习显着边界特征以实现无锚时间动作定位)
\n[paper](https://arxiv.org/abs/2103.13137) | [code](https://github.com/TencentYoutuResearch/ActionDetection-AFSD)

\n\n[11] Temporal Context Aggregation Network for Temporal Action Proposal Refinement(时间上下文聚合网络，用于改进时间行动建议)
\n[paper](https://arxiv.org/abs/2103.13141)

\n\n[10] The Blessings of Unlabeled Background in Untrimmed Videos(未剪辑视频中未标记背景的益处)
\n[paper](https://arxiv.org/abs/2103.13183)

\n\n[9] Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation(时间加权层次聚类，实现无监督动作分割)
\n[paper](https://arxiv.org/abs/2103.11264) | [code](https://github.com/ssarfraz/FINCH-Clustering/tree/master/TW-FINCH)

\n\n[8] Coarse-Fine Networks for Temporal Activity Detection in Videos(粗细网络，用于视频中的时间活动检测)
\n[paper](https://arxiv.org/abs/2103.01302)

\n\n[7] Learning Discriminative Prototypes with Dynamic Time Warping(通过动态时间扭曲学习判别性原型)
\n[paper](https://arxiv.org/pdf/2103.09458.pdf)

\n\n[6] Temporal Action Segmentation from Timestamp Supervision(基于时间戳监督的时序动作分割)
\n[paper](https://arxiv.org/abs/2103.06669)

\n\n[5] ACTION-Net: Multipath Excitation for Action Recognition(用于动作识别的多路径激励)
\n[paper](https://arxiv.org/abs/2103.07372) | [code](https://github.com/V-Sense/ACTION-Net)

\n\n[4] BASAR:Black-box Attack on Skeletal Action Recognition(骨骼动作识别的黑匣子攻击)
\n[paper](https://arxiv.org/abs/2103.05266)

\n\n[3] Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack(了解对抗攻击下基于骨骼的动作识别的鲁棒性)
\n[paper](https://arxiv.org/pdf/2103.05347.pdf)

\n\n[2] Temporal Difference Networks for Efficient Action Recognition(用于有效动作识别的时差网络)
\n[paper](https://arxiv.org/abs/2012.10071) | [code](https://github.com/MCG-NJU/TDN)

\n\n[1] Behavior-Driven Synthesis of Human Dynamics(行为驱动的人类动力学综合)
\n[paper](https://arxiv.org/pdf/2103.04677.pdf) | [code](https://compvis.github.io/behavior-driven-video-synthesis/)\n\n\n \n\n### 行人重识别/检测(Re-Identification/Detection)\n\n[15] Generalizable Person Re-identification with Relevance-aware Mixture of Experts(具有相关性感知混合专家的可泛化的行人重识别)
\n[paper](https://arxiv.org/abs/2105.09156)

\n\n[14] Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification(通过基于记忆的多源元学习来学习概括看不见的域以进行人员重新识别)
\n[paper](https://arxiv.org/abs/2012.00417)

\n\n[13] Joint Generative and Contrastive Learning for Unsupervised Person Re-identification(联合生成和对比学习，用于无监督人员重新识别)
\n[paper](https://arxiv.org/abs/2012.09071) | [code](https://github.com/chenhao2345/GCL)

\n\n[12] Unsupervised Pre-training for Person Re-identification(对人员进行重新识别的无监督预训练)
\n[paper](https://arxiv.org/abs/2012.03753)

\n\n[11] BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification(BiCnet-TKS：学习高效的时空表示以用于视频行人重识别)
\n[paper](https://arxiv.org/abs/2104.14783)

\n\n[10] Unsupervised Multi-Source Domain Adaptation for Person Re-Identification(用于行人重新识别的无监督多源域适配)
\n[paper](https://arxiv.org/abs/2104.12961)

\n\n[9] Combined Depth Space based Architecture Search For Person Re-identification(基于组合深度空间的架构搜索以进行行人重识别)
\n[paper](https://arxiv.org/abs/2104.04163)

\n\n[8] Neural Feature Search for RGB-Infrared Person Re-Identification(用于RGB-红外行人重识别的神经特征搜索)
\n[paper](https://arxiv.org/abs/2104.02366)

\n\n[7] Group-aware Label Transfer for Domain Adaptive Person Re-identification(组感知标签传输，用于域自适应行人重识别)
\n[paper](https://arxiv.org/abs/2103.12366)

\n\n[6] Lifelong Person Re-Identification via Adaptive Knowledge Accumulation(通过自适应知识积累对终身行人重识别)
\n[paper](https://arxiv.org/abs/2103.12462)

\n\n[5] Anchor-Free Person Search(Anchor-Free行人搜索)
\n[paper](https://arxiv.org/abs/2103.11617) | [code](https://github.com/daodaofr/AlignPS)

\n\n[4] Intra-Inter Camera Similarity for Unsupervised Person Re-Identification(摄像机内部相似度用于无监督人员重新识别)
\n[paper](https://arxiv.org/abs/2103.11658)

\n\n[3] Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification(基于视频的人员重新识别的全球指导对等学习)
\n[paper](https://arxiv.org/abs/2103.04337)

\n\n[2] Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification(联合抗噪学习和元相机移位自适应，用于无监督人员的重新识别)
\n[paper](https://arxiv.org/abs/2103.04618)

\n\n[1] Meta Batch-Instance Normalization for Generalizable Person Re-Identification(用于可泛化行人重识别的元批-实例归一化)
\n[paper](https://arxiv.org/abs/2011.14670)

\n\n \n\n### 图像/视频字幕(Image/Video Caption)\n\n[7] Towards Accurate Text-based Image Captioning with Content Diversity Exploration(借助内容多样性探索实现精确的基于文本的图像字幕)
\n[paper](https://arxiv.org/abs/2105.03236)

\n\n[6] Human-like Controllable Image Captioning with Verb-specific Semantic Roles(具有动词特定语义作用的类人可控图像字幕)
\n[paper](https://arxiv.org/abs/2103.12204) | [code](https://github.com/mad-red/VSR-guided-CIC)

\n\n[5] Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos(语义注意的共同接地网络，用于引用视频中的表达理解)
\n[paper](https://arxiv.org/pdf/2103.12346.pdf) | [project](https://sijiesong.github.io/co-grounding)

\n\n[4] Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles(多实例字幕：从组织病理学教科书和文章中学习表示形式)
\n[paper](https://arxiv.org/pdf/2103.05121.pdf)

\n\n[3] Open-book Video Captioning with Retrieve-Copy-Generate Network(带有检索复制生成网络的开卷视频字幕)
\n[paper](https://arxiv.org/pdf/2103.05284.pdf)

\n\n[2] VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs(VX2TEXT：基于多模态输入的视频文本生成端到端学习)
\n[paper](https://arxiv.org/pdf/2101.12059.pdf)

\n\n[1] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans(Scan2Cap：RGB-D扫描中的上下文感知密集字幕)\n[paper](https://arxiv.org/abs/2012.02206) | [code](https://github.com/daveredrum/Scan2Cap) | [project](https://daveredrum.github.io/Scan2Cap/) | [video](https://youtu.be/AgmIpDbwTCY)

\n\n\n
\n\n \n\n## 医学影像(Medical Imaging)\n\n[14] DARCNN: Domain Adaptive Region-based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images(DARCNN：用于生物医学图像中无监督实例分割的基于域自适应区域的卷积神经网络)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Hsu_DARCNN_Domain_Adaptive_Region-Based_Convolutional_Neural_Network_for_Unsupervised_Instance_CVPR_2021_paper.pdf)

\n\n[13] Every Annotation Counts: Multi-label Deep Supervision for Medical Image Segmentation(每种注释都至关重要：医学图像分割的多标签深度监管)
\n[paper](https://arxiv.org/abs/2104.13243)

\n\n[12] DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation(DiNTS：用于3D医学图像分割的可微神经网络拓扑搜索)
\n[paper](https://arxiv.org/abs/2103.15954)

\n\n[11] Confluent Vessel Trees with Accurate Bifurcations(具有精确分叉的汇合血管树)\n[paper](https://arxiv.org/abs/2103.14268)

\n\n[10] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net(无监督多元规范CSCℓ4Net的脑图像合成)
\n[paper](https://arxiv.org/pdf/2103.11587.pdf)

\n\n[9] XProtoNet: Diagnosis in Chest Radiography with Global and Local Explanations(使用全局和局部解释诊断胸部X光片)
\n[paper](https://arxiv.org/pdf/2103.10663.pdf)

\n\n[8] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space(在连续频率空间中通过情景学习进行医学图像分割的联合域泛化)
\n[paper](https://arxiv.org/abs/2103.06030) | [code](https://github.com/liuquande/FedDG-ELCFS)

\n\n[7] Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles(多实例字幕：从组织病理学教科书和文章中学习表示形式)
\n[paper](https://arxiv.org/pdf/2103.05121.pdf)

\n\n[6] Discovering Hidden Physics Behind Transport Dynamics(在运输动力学背后发现隐藏物理)
\n[paper](https://arxiv.org/abs/2011.12222)

\n\n[5] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images(一种心脏标记磁共振图像运动跟踪的无监督深度学习方法)
\n[paper](https://arxiv.org/abs/2103.02772)

\n\n[4] Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning(多机构协作改进基于深度学习的联合学习磁共振图像重建)
\n[paper](https://arxiv.org/abs/2103.02148) | [code](https://github.com/guopengf/FLMRCM)

\n\n[3] 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management(用于胰腺肿块分割，诊断和定量患者管理的3D图形解剖学几何集成网络)

\n\n[2] Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies(深部病变追踪器：在4D纵向成像研究中监控病变)
\n[paper](https://arxiv.org/abs/2012.04872)

\n\n[1] Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization(通过脊柱矫正和解剖学约束优化在CT中自动进行椎骨定位和识别)
\n[paper](https://arxiv.org/abs/2012.07947)

\n\n
\n\n \n\n\n## 文本检测/识别(Text Detection/Recognition)\n\n[7] TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text(TextOCR：对任意形状的场景文本进行大规模的端到端推理)
\n[paper](https://arxiv.org/abs/2105.05486) | [project](https://textvqa.org/textocr)

\n\n[6] Fourier Contour Embedding for Arbitrary-Shaped Text Detection(基于Fourier轮廓嵌入的任意形状文本检测)
\n[paper](https://arxiv.org/abs/2104.10442)

\n\n[5] Scene Text Retrieval via Joint Text Detection and Similarity Learning(通过联合文本检测和相似性学习检索场景文本)
\n[paper](https://arxiv.org/abs/2104.01552) | [code](https://github.com/lanfeng4659/STR-TDSL)

\n\n[4] MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition(迈向写作者自适应的手写文本识别)
\n[paper](https://arxiv.org/abs/2104.01876)

\n\n[3] MOST: A Multi-Oriented Scene Text Detector with Localization Refinement(具有本地化优化功能的多方位场景文本检测器)
\n[paper](https://arxiv.org/abs/2104.01070)

\n\n[2] Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition(像人类一样阅读：用于场景文本识别的自主，双向和迭代语言建模)
\n[paper](https://arxiv.org/abs/2103.06495) | [code](https://github.com/FangShancheng/ABINet)

\n\n[1] What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels(如果我们仅将真实数据集用于场景文本识别该怎么办？带有较少标签的场景文本识别)
\n[paper](https://arxiv.org/abs/2103.04400) | [code](https://github.com/ku21fan/STR-Fewer-Labels)

\n\n
\n\n \n\n## 遥感图像(Remote Sensing Image)\n\n[3] SIPSA-Net: Shift-Invariant Pan Sharpening with Moving Object Alignment for Satellite Imagery(SIPSA-Net：面向卫星图像、带移动目标对齐的平移不变全色锐化)
\n[paper](https://arxiv.org/abs/2105.02400)

\n\n[2] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation(语义流经点以进行航空图像分割)
\n[paper](https://arxiv.org/pdf/2103.06564.pdf)

\n\n[1] Deep Gradient Projection Networks for Pan-sharpening(【超分辨率】用于全色锐化的深度梯度投影网络)
\n[paper](https://arxiv.org/pdf/2103.04584.pdf) | [code](https://github.com/xsxjtu/GPPNN)

\n\n\n
\n\n \n\n## GAN/生成式/对抗式(GAN/Generative/Adversarial)\n\n[27] Enhancing the Transferability of Adversarial Attacks through Variance Tuning(通过方差调整增强对抗性攻击的可转移性)
\n[paper](https://arxiv.org/abs/2103.15571) | [code](https://github.com/JHL-HUST/VT)

\n\n[26] HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms(HistoGAN：通过颜色直方图控制 GAN 生成的图像和真实图像的颜色)
\n[paper](https://arxiv.org/abs/2011.11731)

\n\n[25] Continuous Face Aging via Self-estimated Residual Age Embedding(通过自我估计的残差年龄嵌入来实现连续的面部老化)
\n[paper](https://arxiv.org/abs/2105.00020)

\n\n[24] Unsupervised 3D Shape Completion through GAN Inversion(通过GAN反演实现无监督3D形状补全)
\n[paper](https://arxiv.org/abs/2104.13366) | [project](https://junzhezhang.github.io/projects/ShapeInversion/)

\n\n[23] Delving into Data: Effectively Substitute Training for Black-box Attack(深入研究数据：面向黑盒攻击的有效替代训练)
\n[paper](https://arxiv.org/abs/2104.12378)

\n\n[22] LAFEAT: Piercing Through Adversarial Defenses with Latent Features(LAFEAT：利用潜在特征突破对抗防御)
\n[paper](https://arxiv.org/abs/2104.09284)

\n\n[21] Surrogate Gradient Field for Latent Space Manipulation(潜在空间操纵的替代梯度场)
\n[paper](https://arxiv.org/abs/2104.09065)

\n\n[20] DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort(DatasetGAN：只需最少的人力即可获得的高效标签数据工厂)
\n[paper](https://arxiv.org/abs/2104.06490)

\n\n[19] Regularizing Generative Adversarial Networks under Limited Data(在有限数据下对生成性对抗网络进行正则化)
\n[paper](https://arxiv.org/abs/2104.03310) | [project](https://hytseng0509.github.io/lecam-gan) | [code](https://github.com/google/lecam-gan)

\n\n[18] Content-Aware GAN Compression(内容感知GAN压缩)
\n[paper](https://arxiv.org/abs/2104.02244)

\n\n[17] Lipstick ain't enough: Beyond Color Matching for In-the-Wild Makeup Transfer(口红还不够：超出配色范围的野外化妆效果)
\n[paper](https://arxiv.org/abs/2104.01867) | [code](https://github.com/VinAIResearch/CPM)

\n\n[16] LiBRe: A Practical Bayesian Approach to Adversarial Detection(LiBRe：对抗性检测的实用贝叶斯方法)
\n[paper](https://arxiv.org/abs/2103.14835)

\n\n[15] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network(通过对比生成对抗网络进行多种条件图像合成)
\n[paper](https://arxiv.org/abs/2103.07893)

\n\n[14] Diverse Semantic Image Synthesis via Probability Distribution Modeling(基于概率分布建模的多种语义图像合成)
\n[paper](https://arxiv.org/abs/2103.06878) | [code](https://github.com/tzt101/INADE.git)

\n\n[13] HumanGAN: A Generative Model of Humans Images(人类图像的生成模型)
\n[paper](https://arxiv.org/abs/2103.06902)

\n\n[12] MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks(模拟未知目标模型以提高查询效率的黑盒攻击)
\n[paper](https://arxiv.org/abs/2009.00960) | [code](https://github.com/machanic/MetaSimulator)

\n\n[11] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders(分析和改进自省变分自动编码器)
\n[paper](https://arxiv.org/pdf/2012.13253.pdf) | [code](https://github.com/taldatech/soft-intro-vae-pytorch) | [project](https://taldatech.github.io/soft-intro-vae-web/)

\n\n[10] LOHO: Latent Optimization of Hairstyles via Orthogonalization(LOHO：通过正交化潜在地优化发型)
\n[paper](https://arxiv.org/pdf/2103.03891.pdf)

\n\n[9] PISE: Person Image Synthesis and Editing with Decoupled GAN(使用分离的GAN进行人像合成和编辑)
\n[paper](https://arxiv.org/abs/2103.04023) | [code](https://github.com/Zhangjinso/PISE)

\n\n[8] Closed-Form Factorization of Latent Semantics in GANs(GAN中潜在语义的闭式分解)
\n[paper](https://arxiv.org/abs/2007.06600) | [code](https://github.com/genforce/sefa)

\n\n[7] PD-GAN: Probabilistic Diverse GAN for Image Inpainting(用于图像修复的概率多样GAN)

\n\n[6] Anycost GANs for Interactive Image Synthesis and Editing(用于交互式图像合成和编辑的Anycost GAN)
\n[paper](https://arxiv.org/abs/2103.03243) | [code](https://github.com/mit-han-lab/anycost-gan)

\n\n[5] Efficient Conditional GAN Transfer with Knowledge Propagation across Classes(高效的有条件GAN转移以及跨课程的知识传播)
\n[paper](https://arxiv.org/pdf/2102.06696.pdf) | [code](http://github.com/mshahbazi72/cGANTransfer)

\n\n[4] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing（利用GAN中潜在的空间维度进行实时图像编辑）

\n\n[3] Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs(Hijack-GAN：意外使用经过预训练的黑匣子GAN)
\n[paper](https://arxiv.org/pdf/2011.14107.pdf)

\n\n[2] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation(样式编码：用于图像到图像翻译的StyleGAN编码器)
\n[paper](https://arxiv.org/abs/2008.00951) | [code](https://github.com/eladrich/pixel2style2pixel) | [project](https://eladrich.github.io/pixel2style2pixel/)

\n\n[1] A 3D GAN for Improved Large-pose Facial Recognition(用于改善大姿势面部识别的3D GAN)
\n[paper](https://arxiv.org/pdf/2012.10545.pdf)

\n\n\n
\n\n \n\n### 图像生成/图像合成(Image Generation/Image Synthesis)\n\n[26] Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On(通过虚拟试穿的生成式3D服装模型进行自我监督的碰撞处理)
\n[paper](https://arxiv.org/abs/2105.06462) | [project](http://mslab.es/projects/SelfSupervisedGarmentCollisions)

\n\n[25] Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses(Roof-GAN：学习为住宅生成屋顶几何图形和关系)
\n[paper](https://arxiv.org/abs/2012.09340) | [code](https://github.com/yi-ming-qian/roofgan)

\n\n[24] TediGAN: Text-Guided Diverse Face Image Generation and Manipulation(TediGAN：文本引导的多样面部图像生成和操纵)
\n[paper](https://arxiv.org/abs/2012.03308) | [code](https://github.com/weihaox/TediGAN) | [video](https://youtu.be/L8Na2f5viAM)

\n\n[23] GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving(GeoSim：通过可自动驾驶的几何感知合成进行逼真的视频模拟)
\n[paper](https://arxiv.org/abs/2101.06543)

\n\n[22] GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields(GIRAFFE：将场景表示为合成的生成神经特征场)
\n[paper](https://arxiv.org/abs/2011.12100) | [project](http://bit.ly/giraffe-project)

\n\n[21] Ensembling with Deep Generative Views
\n[paper](https://arxiv.org/abs/2104.14551) | [code](https://github.com/chail/gan-ensembling)

\n\n[20] StylePeople: A Generative Model of Fullbody Human Avatars(StylePeople：全身人类化身的生成模型)
\n[paper](https://arxiv.org/abs/2104.08363) | [code](http://saic-violet.github.io/style-people)
\n\n[19] See through Gradients: Image Batch Recovery via GradInversion(透视渐变：通过GradInversion恢复图像批处理)
\n[paper](https://arxiv.org/abs/2104.07586)

\n\n[18] StEP: Style-based Encoder Pre-training for Multi-modal Image Synthesis(StEP：用于多模式图像合成的基于样式的编码器预训练)
\n[paper](https://arxiv.org/abs/2104.07098)

\n\n[17] Few-shot Image Generation via Cross-domain Correspondence(通过跨域对应小样本图像生成)
\n[paper](https://arxiv.org/abs/2104.06820)

\n\n[16] IMAGINE: Image Synthesis by Image-Guided Model Inversion(想象：通过图像指导模型反演的图像合成)
\n[paper](https://arxiv.org/abs/2104.05895)

\n\n[15] Variational Transformer Networks for Layout Generation
\n[paper](https://arxiv.org/abs/2104.02416)

\n\n[14] VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization(通过未对准感知的归一化进行高分辨率的虚拟试戴)
\n[paper](https://arxiv.org/abs/2103.16874)

\n\n[13] A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection(仔细研究CNN生成图像检测的傅立叶光谱差异)
\n[paper](https://arxiv.org/abs/2103.17195) | [code](https://keshik6.github.io/Fourier-Discrepancies-CNN-Detection/)

\n\n[12] Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans(用于3D人类的高分辨率可编辑纹理的半监督合成)
\n[paper](https://arxiv.org/abs/2103.17266)

\n\n[11] Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling(个性化几何和纹理建模的少量人体运动传递)
\n[paper](https://arxiv.org/abs/2103.14338) | [code](https://github.com/HuangZhiChao95/FewShotMotionTransfer)

\n\n[10] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net(无监督多元规范CSCℓ4Net的脑图像合成)
\n[paper](https://arxiv.org/pdf/2103.11587.pdf)

\n\n[9] Context-Aware Layout to Image Generation with Enhanced Object Appearance(具有增强的对象外观的上下文感知布局到图像生成)
\n[paper](https://arxiv.org/abs/2103.11897)

\n\n[8] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network(通过对比生成对抗网络进行多种条件图像合成)
\n[paper](https://arxiv.org/abs/2103.07893)

\n\n[7] HumanGAN: A Generative Model of Humans Images(人类图像的生成模型)
\n[paper](https://arxiv.org/abs/2103.06902)

\n\n[6] PISE: Person Image Synthesis and Editing with Decoupled GAN(使用分离的GAN进行人像合成和编辑)
\n[paper](https://arxiv.org/abs/2103.04023) | [code](https://github.com/Zhangjinso/PISE)

\n\n[5] SMPLicit: Topology-aware Generative Model for Clothed People(穿衣服的人的拓扑感知生成模型)
\n[paper](https://arxiv.org/pdf/2103.06871.pdf) | [code](http://www.iri.upc.edu/people/ecorona/smplicit/)

\n\n[4] Diversifying Sample Generation for Data-Free Quantization（多样化的样本生成，实现无数据量化）
\n[paper](https://arxiv.org/abs/2103.01049)

\n\n[3] Diverse Semantic Image Synthesis via Probability Distribution Modeling(基于概率分布建模的多种语义图像合成)
\n[paper](https://arxiv.org/abs/2103.06878) | [code](https://github.com/tzt101/INADE.git)

\n\n[2] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework(当年龄不变的人脸识别遇到人脸年龄合成时：一个多任务学习框架)
\n[paper](https://arxiv.org/abs/2103.01520) | [code](https://github.com/Hzzone/MTLFace)

\n\n[1] Anycost GANs for Interactive Image Synthesis and Editing(用于交互式图像合成和编辑的Anycost GAN)
\n[paper](https://arxiv.org/abs/2103.03243) | [code](https://github.com/mit-han-lab/anycost-gan)

\n\n \n\n### 视图合成(View Synthesis)\n\n[7] Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes(用于动态场景时空视图合成的神经场景流场)
\n[paper](https://arxiv.org/abs/2011.13084) | [project](http://www.cs.cornell.edu/~zl548/NSFF/)\n\n[6] Stable View Synthesis(稳定的视图合成)
\n[paper](https://arxiv.org/abs/2011.07233) | [code](https://github.com/intel-isl/StableViewSynthesis)

\n\n[5] Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes(立体辐射场（SRF）：学习新颖的场景的稀疏视图的视图合成)
\n[paper](https://arxiv.org/abs/2104.06935) | [project](https://virtualhumans.mpi-inf.mpg.de/srf/)

\n\n[4] Layout-Guided Novel View Synthesis from a Single Indoor Panorama(单一室内全景的布局引导式新颖视图合成)
\n[paper](https://arxiv.org/abs/2103.17022) | [project](https://github.com/bluestyle97/PNVS)

\n\n[3] NeX: Real-time View Synthesis with Neural Basis Expansion(NeX：具有神经基础扩展的实时视图合成)
\n[paper](https://arxiv.org/abs/2103.05606) | [code](https://nex-mpi.github.io/)

\n\n[2] ID-Unet: Iterative Soft and Hard Deformation for View Synthesis(视图合成的迭代软硬变形)
\n[paper](https://arxiv.org/abs/2103.02264)

\n\n[1] Self-Supervised Visibility Learning for Novel View Synthesis(自我监督的可视性学习，用于新颖的视图合成)
\n[paper](https://arxiv.org/abs/2103.15407)

\n\n
\n\n \n\n## 三维视觉(3D Vision)\n\n[8] 3D Spatial Recognition without Spatially Labeled 3D(没有空间标记的3D的3D空间识别)
\n[paper](https://arxiv.org/abs/2105.06461)

\n\n[7] Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations(等值点：使用混合表示优化神经隐式曲面)
\n[paper](https://arxiv.org/abs/2012.06434) | [code](https://github.com/yifita/iso-points)

\n\n[6] Learning Feature Aggregation for Deep 3D Morphable Models(深度3D可变形模型的学习特征聚合)
\n[paper](https://arxiv.org/abs/2105.02173)

\n\n[5] Deep Polarization Imaging for 3D shape and SVBRDF Acquisition(用于3D形状和SVBRDF采集的深偏振成像)
\n[paper](https://arxiv.org/abs/2105.02875)

\n\n[4] Unsupervised 3D Shape Completion through GAN Inversion(通过GAN反演实现无监督3D形状补全)
\n[paper](https://arxiv.org/abs/2104.13366) | [project](https://junzhezhang.github.io/projects/ShapeInversion/)

\n\n[3] KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control(【3D关键点】关键点变形器：用于形状控制的无监督三维关键点发现)
\n[paper](https://arxiv.org/abs/2104.11224) | [project](http://tomasjakab.github.io/KeypointDeformer)

\n\n[2] A Deep Emulator for Secondary Motion of 3D Characters(三维角色二次运动的深度仿真器)\n[paper](https://arxiv.org/abs/2103.01261)\n\n[1] 3D CNNs with Adaptive Temporal Feature Resolutions(具有自适应时间特征分辨率的3D CNN)
\n[paper](https://arxiv.org/abs/2011.08652)

\n\n \n\n### 点云(Point Cloud)\n\n[32] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation([点云分割]用于 LiDAR 分割的圆柱形和非对称 3D 卷积网络)
\n[paper](https://arxiv.org/abs/2011.10033) | [code](https://github.com/xinge008/Cylinder3D)

\n\n[31] DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution(DyCo3D：通过动态卷积对 3D 点云进行稳健的实例分割)
\n[paper](https://arxiv.org/abs/2011.13328) | [code](https://git.io/DyCo3D)

\n\n[30] PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds(PV-RAFT：用于点云场景流估计的点体素相关字段)
\n[paper](https://arxiv.org/abs/2012.00987)

\n\n[29] PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization(PWCLO-Net：使用分层嵌入掩码优化的 3D 点云中的深度激光雷达测距)
\n[paper](https://arxiv.org/abs/2012.00972) | [code](https://github.com/IRMVLab/PWCLONet)

\n\n[28] Omni-supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning(通过渐进感受野分量推理的全方位监督点云分割)
\n[paper](https://arxiv.org/abs/2105.10203) | [code](https://github.com/azuki-miho/RFCR)

\n\n[27] PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths(PMP-Net：通过学习多步点移动路径来完成点云)
\n[paper](https://arxiv.org/abs/2012.03408)

\n\n[26] VoxelContext-Net: An Octree based Framework for Point Cloud Compression(VoxelContext-Net：基于Octree的点云压缩框架)
\n[paper](https://arxiv.org/abs/2105.02158)

\n\n[25] Variational Relational Point Completion Network(变分关系点完备网络)
\n[paper](https://arxiv.org/abs/2104.10154) | [project](https://paul007pl.github.io/projects/VRCNet.html)

\n\n[24] SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements(规模：使用关节局部元素的表面编解码器模拟穿衣服的人)
\n[paper](https://arxiv.org/abs/2104.07660) | [code](https://qianlim.github.io/SCALE)

\n\n[23] RPSRNet: End-to-End Trainable Rigid Point Set Registration Network using Barnes-Hut 2D-Tree Representation(RPSRNet：使用Barnes-Hut二维树表示法的端到端可训练刚性点集配准网络)
\n[paper](https://arxiv.org/abs/2104.05328)

\n\n[22] View-Guided Point Cloud Completion(视图引导的点云完成)
\n[paper](https://arxiv.org/abs/2104.05666)

\n\n[21] DeepI2P: Image-to-Point Cloud Registration via Deep Classification(通过深度分类的图像到点云配准)
\n[paper](https://arxiv.org/abs/2104.03501) | [code](https://github.com/lijx10/DeepI2P)

\n\n[20] FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds(FESTA：场景点云通过时空注意进行光流估计)
\n[paper](https://arxiv.org/abs/2104.00798)

\n\n[19] Denoise and Contrast for Category Agnostic Shape Completion(类别不可知形状完成的消噪和对比度)
\n[paper](https://arxiv.org/abs/2103.16671)

\n\n[18] Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation(无提案的LiDAR点云全景分割)
\n[paper](https://arxiv.org/pdf/2103.14962.pdf)

\n\n[17] ReAgent: Point Cloud Registration using Imitation and Reinforcement Learning(ReAgent：使用模仿和强化学习进行点云配准)
\n[paper](https://arxiv.org/abs/2103.15231)

\n\n[16] Equivariant Point Network for 3D Point Cloud Analysis(等变点网络进行3D点云分析)
\n[paper](https://arxiv.org/abs/2103.14147)

\n\n[15] PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds(PAConv：点云上具有动态内核组装的位置自适应卷积)
\n[paper](https://arxiv.org/abs/2103.14635) | [code](https://github.com/CVMI-Lab/PAConv)

\n\n[14] Skeleton Merger: an Unsupervised Aligned Keypoint Detector(骨架合并：无监督的对准关键点检测器)
\n[paper](https://arxiv.org/pdf/2103.10814.pdf) | [code](https://github.com/eliphatfs/SkeletonMerger)

\n\n[13] Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding(使用缺失区域编码的循环变换完成不成对的点云)
\n[paper](https://arxiv.org/abs/2103.07838)

\n\n[12] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion(通过双边扩充和自适应融合对实点云场景进行语义分割)
\n[paper](https://arxiv.org/abs/2103.07074)

\n\n[11] How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines(线云如何保护隐私？从3D线中恢复场景详细信息)
\n[paper](https://arxiv.org/pdf/2103.05086.pdf) | [code](https://github.com/kunalchelani/Line2Point)

\n\n[10] PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency(使用深度空间一致性进行稳健的点云配准)
\n[paper](https://arxiv.org/abs/2103.05465) | [code](https://github.com/XuyangBai/PointDSC)

\n\n[9] Robust Point Cloud Registration Framework Based on Deep Graph Matching(基于深度图匹配的鲁棒点云配准框架)
\n[paper](https://arxiv.org/pdf/2103.04256.pdf) | [code](https://github.com/fukexue/RGM)

\n\n[8] TPCN: Temporal Point Cloud Networks for Motion Forecasting(面向运动预测的时态点云网络)\n[paper](https://arxiv.org/abs/2103.03067)\n\n[7] PointGuard: Provably Robust 3D Point Cloud Classification(可证明稳健的三维点云分类)
\n[paper](https://arxiv.org/abs/2103.03046)

\n\n[6] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges(走向城市规模3D点云的语义分割：数据集，基准和挑战)
\n[paper](https://arxiv.org/abs/2009.03137) | [code](https://github.com/QingyongHu/SensatUrban)

\n\n[5] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration(SpinNet：学习用于3D点云配准的通用表面描述符)
\n[paper](https://arxiv.org/abs/2011.12149) | [code](https://github.com/QingyongHu/SpinNet)

\n\n[4] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization(通过3D扫描同步进行多主体分割和运动估计)
\n[paper](https://arxiv.org/pdf/2101.06605.pdf) | [code](https://github.com/huangjh-pub/multibody-sync)

\n\n[3] Diffusion Probabilistic Models for 3D Point Cloud Generation(三维点云生成的扩散概率模型)
\n[paper](https://arxiv.org/abs/2103.01458) | [code](https://github.com/luost26/diffusion-point-cloud)

\n\n[2] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion(用于点云补全的对抗性渲染基于样式的点生成器)
\n[paper](https://arxiv.org/abs/2103.02535)\n\n[1] PREDATOR: Registration of 3D Point Clouds with Low Overlap(预测器：低重叠的3D点云的配准)
\n[paper](https://arxiv.org/pdf/2011.13005.pdf) | [code](https://github.com/ShengyuH/OverlapPredator) | [project](https://overlappredator.github.io/)

\n\n\n \n\n### 三维重建(3D Reconstruction)\n\n[25] 3D-NVS: A 3D Supervision Approach for Next View Selection(3D-NVS：下一个视图选择的 3D 监督方法)
\n[paper](https://arxiv.org/abs/2012.01743)

\n\n[24] Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance(未知通用反射率的无纹理光滑表面的多视图 3D 重建)
\n[paper](https://arxiv.org/abs/2105.11599)

\n\n[23] Sketch2Model: View-Aware 3D Modeling from Single Free-Hand Sketches(Sketch2Model：从单个徒手草图开始的具有视图感知能力的3D建模)
\n[paper](https://arxiv.org/abs/2105.06663)

\n\n[22] From Points to Multi-Object 3D Reconstruction(从点到多对象3D重建)
\n[paper](https://arxiv.org/abs/2012.11575)

\n\n[21] End-to-End Human Pose and Mesh Reconstruction with Transformers(使用Transformer进行端到端的人体姿势和网格重建)
\n[paper](https://arxiv.org/abs/2012.09760)

\n\n[20] DECOR-GAN: 3D Shape Detailization by Conditional Refinement(DECOR-GAN：通过条件细化实现3D形状细化)
\n[paper](https://arxiv.org/abs/2012.09159) | [code](https://github.com/czq142857/DECOR-GAN)

\n\n[19] pixelNeRF: Neural Radiance Fields from One or Few Images(pixelNeRF：一幅或几幅图像的神经辐射场)
\n[paper](https://arxiv.org/abs/2012.02190) | [project](https://alexyu.net/pixelnerf)

\n\n[18] LASR: Learning Articulated Shape Reconstruction from a Monocular Video(LASR：从单眼视频中学习关节形状的重建)
\n[paper](https://arxiv.org/abs/2105.02976) | [code](http://lasr-google.github.io/)

\n\n[17] Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images(重访长方体：学习适合单个RGB图像的稳健3D形状)
\n[paper](https://arxiv.org/abs/2105.02047)

\n\n[16] Multi-person Implicit Reconstruction from a Single Image(从单个图像进行多人隐式重建)
\n[paper](https://arxiv.org/abs/2104.09283)

\n\n[15] CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo
\n[paper](https://arxiv.org/abs/2104.04641)

\n\n[14] StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision(通过立体视觉进行深度感知的布景人类数字化)
\n[paper](https://arxiv.org/abs/2104.05289) | [project](http://crishy1995.github.io/StereoPIFuProject)

\n\n[13] Global Transport for Fluid Reconstruction with Learned Self-Supervision
\n[paper](https://arxiv.org/abs/2104.06031) | [code](https://github.com/tum-pbs/Global-Flow-Transport)

\n\n[12] Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction(全面了解通用对象：建模，分段和重构)
\n[paper](https://arxiv.org/abs/2104.00858)

\n\n[11] Reconstructing 3D Human Pose by Watching Humans in the Mirror(通过照镜子中的人来重建3D人的姿势)
\n[paper](https://arxiv.org/abs/2104.00340) | [project](https://zju3dv.github.io/Mirrored-Human/)

\n\n[10] Fostering Generalization in Single-view 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors(通过学习局部和全局形状先验的层次结构，促进单视图3D重构中的泛化)
\n[paper](https://arxiv.org/abs/2104.00476)

\n\n[9] NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video(单目视频的实时相干3D重建)
\n[paper](https://arxiv.org/abs/2104.00681) | [project](https://zju3dv.github.io/neuralrecon/)

\n\n[8] Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction(从时空描述符中学习并行密集对应，以进行有效且鲁棒的4D重建)
\n[paper](https://arxiv.org/abs/2103.16341) | [code](https://github.com/tangjiapeng/LPDC-Net)

\n\n[7] POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture(用于单视图人体体积捕获的姿势引导选择性融合)
\n[paper](https://arxiv.org/abs/2103.15331) | [project](http://www.liuyebin.com/posefusion/posefusion.html)

\n\n[6] Deep Implicit Moving Least-Squares Functions for 3D Reconstruction(用于3D重构的深层隐式移动最小二乘函数)
\n[paper](https://arxiv.org/abs/2103.12266) | [code](https://github.com/Andy97/DeepMLS)

\n\n[5] Model-based 3D Hand Reconstruction via Self-Supervised Learning(通过自我监督学习进行基于模型的3D手重建)
\n[paper](https://arxiv.org/pdf/2103.11703)

\n\n[4] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction(单视图3D漫画面部重建的数据集和基线方法)
\n[paper](https://arxiv.org/pdf/2103.08204.pdf) | [project](https://qiuyuda.github.io/3DCaricShop/)

\n\n[3] Learning Compositional Representation for 4D Captures with Neural ODE(使用神经ODE学习4D捕捉的合成表示)
\n[paper](https://arxiv.org/pdf/2103.08271.pdf)

\n\n[2] SMPLicit: Topology-aware Generative Model for Clothed People(穿衣服的人的拓扑感知生成模型)
\n[paper](https://arxiv.org/pdf/2103.06871.pdf) | [code](http://www.iri.upc.edu/people/ecorona/smplicit/)

\n\n[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
\n[paper](https://arxiv.org/abs/2011.13607)

\n\n
\n\n \n\n## 模型压缩(Model Compression)\n\n[6] Towards Compact CNNs via Collaborative Compression(通过协同压缩迈向紧凑型CNN)
\n[paper](https://arxiv.org/abs/2105.11228)

\n\n[5] Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation(联合DetNAS：用NAS、剪枝和动态蒸馏升级你的探测器)
\n[paper](https://arxiv.org/abs/2105.12971)

\n\n[4] Skip-Convolutions for Efficient Video Processing(跳过卷积以实现高效的视频处理)
\n[paper](https://arxiv.org/abs/2104.11487)

\n\n[3] Content-Aware GAN Compression(内容感知GAN压缩)
\n[paper](https://arxiv.org/abs/2104.02244)

\n\n[2] Dynamic Slimmable Network(动态可压缩网络)
\n[paper](https://arxiv.org/abs/2103.13258) | [code](https://github.com/changlin31/DS-Net)

\n\n[1] Learning Student Networks in the Wild (a model compression and acceleration technique that requires no original training data)
\n[paper](https://arxiv.org/pdf/1904.01186.pdf) | [code](https://github.com/huawei-noah/DAFL)
\nCommentary: [Huawei Noah's Ark Lab proposes a data-free network compression technique](https://zhuanlan.zhihu.com/p/81277796)

\n\n \n\n### 知识蒸馏(Knowledge Distillation)\n\n[13] Positive-Unlabeled Data Purification in the Wild for Object Detection(野外检测对象的阳性无标签数据提纯)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Guo_Positive-Unlabeled_Data_Purification_in_the_Wild_for_Object_Detection_CVPR_2021_paper.pdf)

\n\n[12] Wasserstein Contrastive Representation Distillation(Wasserstein对比表示蒸馏)
\n[paper](https://arxiv.org/abs/2012.08674)

\n\n[11] Distilling Knowledge via Knowledge Review(通过知识回顾提炼知识)
\n[paper](https://arxiv.org/abs/2104.09044) | [code](https://github.com/Jia-Research-Lab/ReviewKD)

\n\n[10] 3D-to-2D Distillation for Indoor Scene Parsing(用于室内场景解析的3D到2D蒸馏)
\n[paper](https://arxiv.org/abs/2104.02243)

\n\n[9] Complementary Relation Contrastive Distillation(互补关系对比蒸馏)
\n[paper](https://arxiv.org/abs/2103.16367)

\n\n[8] Distilling Object Detectors via Decoupled Features(通过解耦功能蒸馏物体检测器)
\n[paper](https://arxiv.org/abs/2103.14475) | [code](https://github.com/ggjy/DeFeat.pytorch)

\n\n[7] Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation(通过自学来完善自己：通过自我蒸馏提炼特征)
\n[paper](https://arxiv.org/pdf/2103.08273.pdf) | [code](https://github.com/MingiJi/FRSKD)

\n\n[6] Knowledge Evolution in Neural Networks(神经网络中的知识进化)
\n[paper](https://arxiv.org/pdf/2103.05152.pdf) | [code](https://github.com/ahmdtaha/knowledge_evolution)

\n\n[5] Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning(少班级增量学习的语义感知知识蒸馏)
\n[paper](https://arxiv.org/abs/2103.04059)

\n\n[4] Teachers Do More Than Teach: Compressing Image-to-Image Models
\n[paper](https://arxiv.org/abs/2103.03467) | [code](https://github.com/snap-research/CAT)

\n\n[3] General Instance Distillation for Object Detection(通用实例蒸馏技术在目标检测中的应用)
\n[paper](https://arxiv.org/abs/2103.02340)

\n\n[2] Multiresolution Knowledge Distillation for Anomaly Detection(用于异常检测的多分辨率知识蒸馏)
\n[paper](https://arxiv.org/abs/2011.11108)

\n\n[1] Distilling Object Detectors via Decoupled Features (distillation that separates foreground and background)

\n\n\n \n\n### 剪枝(Pruning)\n\n[3] Convolutional Neural Network Pruning with Structural Redundancy Reduction(减少结构冗余的卷积神经网络修剪)
\n[paper](https://arxiv.org/abs/2104.03438)

\n\n[2] Neural Response Interpretation through the Lens of Critical Pathways(关键途径对神经反应的解释)
\n[paper](https://arxiv.org/abs/2103.16886) | [code1](https://github.com/CAMP-eXplain-AI/PathwayGrad) | [code2](https://github.com/CAMP-eXplain-AI/RoarTorch)

\n\n[1] Manifold Regularized Dynamic Network Pruning(流形规则化动态网络剪枝)
\n[paper](https://arxiv.org/pdf/2103.05861.pdf)

\n\n \n\n### 量化(Quantization)\n\n[3] Network Quantization with Element-wise Gradient Scaling(逐元素梯度缩放的网络量化)
\n[paper](https://arxiv.org/abs/2104.00903)

\n\n[2] Zero-shot Adversarial Quantization(零样本对抗量化)
\n[paper](https://arxiv.org/abs/2103.15263) | [code](https://git.io/Jqc0y)

\n\n[1] Learnable Companding Quantization for Accurate Low-bit Neural Networks(精确低位神经网络的可学习压扩量化)
\n[paper](https://arxiv.org/pdf/2103.07156.pdf)

\n\n
\n\n \n\n## 神经网络结构设计(Neural Network Structure Design)\n\n[15] Improving Accuracy of Binary Neural Networks using Unbalanced Activation Distribution(使用不平衡激活分布提高二元神经网络的准确性)
\n[paper](https://arxiv.org/abs/2012.00938)

\n\n[14] Heterogeneous Grid Convolution for Adaptive, Efficient, and Controllable Computation(用于自适应、高效和可控计算的异构网格卷积)
\n[paper](https://arxiv.org/abs/2104.11176)

\n\n[13] AsymmNet: Towards ultralight convolution neural networks using asymmetrical bottlenecks(AsymmNet：利用不对称瓶颈迈向超轻型卷积神经网络)
\n[paper](https://arxiv.org/abs/2104.07770) | [code](https://github.com/Spark001/AsymmNet)

\n\n[12] CondenseNet V2: Sparse Feature Reactivation for Deep Networks(CondenseNet V2：深度网络的稀疏功能重新激活)
\n[paper](https://arxiv.org/abs/2104.04382)

\n\n[11] Convolutional Hough Matching Networks(卷积霍夫匹配网络)
\n[paper](https://arxiv.org/abs/2103.16831)

\n\n[10] Capsule Network is Not More Robust than Convolutional Network(胶囊网络并不比卷积网络更健壮)
\n[paper](https://arxiv.org/abs/2103.15459)

\n\n[9] Diverse Branch Block: Building a Convolution as an Inception-like Unit(多元分支块：将卷积构建为类似初始的单位)
\n[paper](https://arxiv.org/abs/2103.13425) | [code](https://github.com/DingXiaoH/DiverseBranchBlock)

\n\n[8] Scaling Local Self-Attention For Parameter Efficient Visual Backbones(扩展局部自注意力以获得有效的参数视觉主干)
\n[paper](https://arxiv.org/pdf/2103.12731.pdf)

\n\n[7] Fast and Accurate Model Scaling(快速准确的模型缩放)
\n[paper](https://arxiv.org/abs/2103.06877)

\n\n[6] Involution: Inverting the Inherence of Convolution for Visual Recognition(反转卷积的固有性以进行视觉识别)
\n[paper](https://arxiv.org/abs/2103.06255) | [code](https://github.com/d-li14/involution)

\n\n[5] Inception Convolution with Efficient Dilation Search(具有有效膨胀搜索的初始卷积)
\n[paper](https://arxiv.org/pdf/2012.13587.pdf) | [code](https://github.com/yifan123/IC-Conv) | [commentary: Inception convolution](https://zhuanlan.zhihu.com/p/354194188)\n

\n\n[4] Coordinate Attention for Efficient Mobile Network Design(协调注意力以实现高效的移动网络设计)
\n[paper](https://arxiv.org/abs/2103.02907)

\n\n[3] Rethinking Channel Dimensions for Efficient Model Design(重新考虑通道尺寸以进行有效的模型设计)
\n[paper](https://arxiv.org/abs/2007.00992) | [code](https://github.com/clovaai/rexnet)

\n\n[2] Inverting the Inherence of Convolution for Visual Recognition（颠倒卷积的固有性以进行视觉识别）

\n\n[1] RepVGG: Making VGG-style ConvNets Great Again
\n[paper](https://arxiv.org/abs/2101.03697) | [code](https://github.com/megvii-model/RepVGG)
\nCommentary: [RepVGG: a minimalist architecture with SOTA performance that makes VGG-style models great again](https://zhuanlan.zhihu.com/p/344324470)

\n\n \n\n### Transformer\n\n[2] Transformer Interpretability Beyond Attention Visualization(注意力可视化之外的Transformer可解释性)
\n[paper](https://arxiv.org/pdf/2012.09838.pdf) | [code](https://github.com/hila-chefer/Transformer-Explainability)

\n\n[1] Pre-Trained Image Processing Transformer (a pre-trained model for low-level vision)
\n[paper](https://arxiv.org/pdf/2012.00364.pdf) | [commentary: Transformers take another domain: IPT, a pre-trained model proposed jointly by Peking University, Huawei, and others, tops the leaderboards on several low-level tasks](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247524529&idx=1&sn=e39e67981b2afd9a5369cc843ddf28fe&chksm=ec1c8d48db6b045e7cf2d37c5633da8d3caf5e53178fe6df0913f2a04bc143fcc2e504d6f9be&token=2008688100&lang=zh_CN#rd)

\n\n \n\n### 图神经网络(GNN)\n\n[3] A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts(窥探神经网络的推理：解读结构视觉概念)
\n[paper](https://arxiv.org/abs/2105.00290)

\n\n[2] Quantifying Explainers of Graph Neural Networks in Computational Pathology(计算病理学中图神经网络的量化解释器)
\n[paper](https://arxiv.org/pdf/2011.12646.pdf)

\n\n[1] Sequential Graph Convolutional Network for Active Learning(主动学习的顺序图卷积网络)
\n[paper](https://arxiv.org/pdf/2006.10219.pdf)

\n\n \n\n### 神经网络架构搜索(NAS)\n\n[16] FP-NAS: Fast Probabilistic Neural Architecture Search(FP-NAS：快速概率神经架构搜索)
\n[paper](https://arxiv.org/abs/2011.10949)

\n\n[15] ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search(ViPNAS：通过神经架构搜索进行高效的视频姿态估计)
\n[paper](https://arxiv.org/abs/2105.10154)

\n\n[14] BCNet: Searching for Network Width with Bilaterally Coupled Network(BCNet：用双边耦合网络搜索网络宽度)
\n[paper](https://arxiv.org/abs/2105.10533)

\n\n[13] TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search(TransNAS-Bench-101：提高跨任务神经架构搜索的可转移性和通用性)
\n[paper](https://arxiv.org/abs/2105.11871)

\n\n[12] Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation(联合DetNAS：用NAS、剪枝和动态蒸馏升级你的探测器)
\n[paper](https://arxiv.org/abs/2105.12971)

\n\n[11] Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search(具有里程碑意义的正则化：神经体系结构搜索中的排名指导超级网络培训)
\n[paper](https://arxiv.org/abs/2104.05309)

\n\n[10] NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization(具有快速超级网络培训和架构优化的高效神经架构搜索)
\n[paper](https://arxiv.org/abs/2104.00031) | [project](http://netadapt.mit.edu/)

\n\n[9] One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking(通过分流引导的搜索空间缩小实现一站式神经集成结构搜索)
\n[paper](https://arxiv.org/abs/2104.00597) | [code](https://github.com/researchmm/NEAS)

\n\n[8] Dynamic Slimmable Network(动态可压缩网络)
\n[paper](https://arxiv.org/abs/2103.13258) | [code](https://github.com/changlin31/DS-Net)

\n\n[7] Prioritized Architecture Sampling with Monto-Carlo Tree Search(蒙特卡洛树搜索的优先架构采样)
\n[paper](https://arxiv.org/pdf/2103.11922.pdf) | [code](https://github.com/xiusu/NAS-Bench-Macro)

\n\n[6] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator(通过生成进行搜索：带有架构生成器的灵活高效的一键式NAS)
\n[paper](https://arxiv.org/abs/2103.07289) | [code](https://github.com/eric8607242/SGNAS)

\n\n[5] Contrastive Neural Architecture Search with Neural Architecture Comparators(带有神经结构比较器的对比神经网络架构搜索)
\n[paper](https://arxiv.org/abs/2103.05471) | [code](https://github.com/chenyaofo/CTNAS)

\n\n[4] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
\n[paper](https://arxiv.org/abs/2103.04507) | [code](https://github.com/VDIGPKU/OPANAS)

\n\n[3] AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling(通过注意力改善神经架构搜索)
\n[paper](https://arxiv.org/pdf/2011.09011.pdf)

\n\n[2] ReNAS: Relativistic Evaluation of Neural Architecture Search (the importance of the ranking loss in NAS predictors)
\n[paper](https://arxiv.org/pdf/1910.01523.pdf)

\n\n[1] HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens (reducing the cost of NAS)
\n[paper](https://arxiv.org/pdf/2005.14446.pdf)

\n\n\n
\n\n \n\n## 数据处理(Data Processing)\n\n \n\n### 数据增广(Data Augmentation)\n\n[3] A Fourier-based Framework for Domain Generalization(基于傅立叶的域泛化框架)
\n[paper](https://arxiv.org/abs/2105.11120)

\n\n[2] AutoDO: Robust AutoAugment for Biased Data with Label Noise via Scalable Probabilistic Implicit Differentiation(通过可扩展的概率隐式微分对带有标签噪声的有偏数据进行鲁棒的自动增强)
\n[paper](https://arxiv.org/abs/2103.05863)

\n\n[1] KeepAugment: A Simple Information-Preserving Data Augmentation(一种简单的保存信息的数据扩充)
\n[paper](https://arxiv.org/pdf/2011.11778.pdf)

\n\n \n\n### 表征学习(Representation Learning)\n\n[19] Task Programming: Learning Data Efficient Behavior Representations(任务编程：学习数据高效行为表征)
\n[paper](https://arxiv.org/abs/2011.13917) | [code](https://github.com/neuroethology/TREBA) | [project](https://sites.google.com/view/task-programming)

\n\n[18] Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization(通过对比交叉视图互信息最大化学习视图解开人体姿势表示)
\n[paper](https://arxiv.org/abs/2012.01405) | [code](https://github.com/google-research/google-research/tree/master/poem)

\n\n[17] SSAN: Separable Self-Attention Network for Video Representation Learning(SSAN：用于视频表示学习的可分离自注意力网络)
\n[paper](https://arxiv.org/abs/2105.13033)

\n\n[16] Generative Interventions for Causal Learning(因果学习的生成性干预)
\n[paper](https://arxiv.org/abs/2012.12265)

\n\n[15] Representation Learning via Global Temporal Alignment and Cycle-Consistency(通过全局时间对齐和周期一致性进行表示学习)
\n[paper](https://arxiv.org/abs/2105.05217)

\n\n[14] Multi-Perspective LSTM for Joint Visual Representation Learning(用于联合视觉表示学习的多视角LSTM)
\n[paper](https://arxiv.org/abs/2105.02802) | [code](https://github.com/arsm/MPLSTM)

\n\n[13] Unsupervised Visual Representation Learning by Tracking Patches in Video(通过跟踪视频中的补丁来进行无监督的视觉表示学习)
\n[paper](https://arxiv.org/abs/2105.02545) | [code](http://github.com/microsoft/CtP)

\n\n[12] A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning(无监督时空表示学习的大规模研究)
\n[paper](https://arxiv.org/abs/2104.14558)

\n\n[11] Where and What? Examining Interpretable Disentangled Representations(在哪里和什么？检查可解释的纠缠表示)
\n[paper](https://arxiv.org/abs/2104.05622)

\n\n[10] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning(眼见为实：视觉语言表示学习的端到端预训练)
\n[paper](https://arxiv.org/abs/2104.03135)

\n\n[9] Self-supervised Video Representation Learning by Context and Motion Decoupling(通过上下文和运动解耦进行自我监督的视频表示学习)
\n[paper](https://arxiv.org/abs/2104.00862)

\n\n[8] Jigsaw Clustering for Unsupervised Visual Representation Learning(拼图聚类的无监督视觉表示学习)
\n[paper](https://arxiv.org/abs/2104.00323) | [code](https://github.com/Jia-Research-Lab/JigsawClustering)

\n\n[7] Learning by Aligning Videos in Time(【视频表征】通过时间对齐视频进行学习)
\n[paper](https://arxiv.org/abs/2103.17260)

\n\n[6] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting(矢量化和光栅化：素描和手写的自我指导学习)
\n[paper](https://arxiv.org/abs/2103.13716) | [code](https://github.com/AyanKumarBhunia/Self-Supervised-Learning-for-Sketch)

\n\n[5] Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks(神经零件：使用可逆神经网络学习富有表现力的3D形状提取)
\n[paper](https://arxiv.org/pdf/2103.10429.pdf)

\n\n[4] VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples(对比视频表示学习和临时对抗示例)
\n[paper](https://arxiv.org/abs/2103.05905)

\n\n[3] Spatially Consistent Representation Learning(空间一致表示学习)
\n[paper](https://arxiv.org/abs/2103.06122)

\n\n[2] Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning(通过添加背景来删除背景：面向背景鲁棒的自监督视频表示学习)
\n[paper](https://arxiv.org/pdf/2009.05769.pdf) | [code](https://github.com/FingerRec/BE) | [project](https://fingerrec.github.io/index_files/jinpeng/papers/CVPR2021/project_website.html) | [解读](https://zhuanlan.zhihu.com/p/355232006)

\n\n[1] VirTex: Learning Visual Representations from Textual Annotations（从文本注释中学习视觉表示）
\n[paper](https://arxiv.org/abs/2006.06666) | [code](https://github.com/kdexd/virtex)

\n\n \n\n### 归一化/正则化(Normalization/Regularization)\n\n[3] Adaptive Consistency Regularization for Semi-Supervised Transfer Learning(半监督迁移学习的自适应一致性正则化)
\n[paper](https://arxiv.org/abs/2103.02193) | [code](https://github.com/SHI-Labs/Semi-Supervised-Transfer-Learning)\n\n[2] Meta Batch-Instance Normalization for Generalizable Person Re-Identification(面向可泛化行人重识别的元批-实例归一化)
\n[paper](https://arxiv.org/abs/2011.14670)

\n\n[1] Representative Batch Normalization with Feature Calibration（具有特征校准功能的代表性批量归一化）

\n\n \n\n### 图像聚类(Image Clustering)\n\n[4] Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes(具有10^7个节点的大规模图上的结构感知人脸聚类)
\n[paper](https://arxiv.org/abs/2103.13225) | [code&project](https://sstzal.github.io/STAR-FC/)

\n\n[3] COMPLETER: Incomplete Multi-view Clustering via Contrastive Prediction(通过对比预测的不完整多视图聚类)
\n[paper](http://pengxi.me/wp-content/uploads/2021/03/2021CVPR-completer.pdf) | [code](https://github.com/XLearning-SCU/2021-CVPR-Completer)

\n\n[2] Improving Unsupervised Image Clustering With Robust Learning（通过鲁棒学习改善无监督图像聚类）
\n[paper](https://arxiv.org/abs/2012.11150) | [code](https://github.com/deu30303/RUC)

\n\n[1] Reconsidering Representation Alignment for Multi-view Clustering(重新考虑多视图聚类的表示对齐方式)
\n[paper](https://arxiv.org/abs/2103.07738) | [code](https://github.com/DanielTrosten/mvc)

\n\n\n \n\n### 图像压缩(Image Compression)\n\n[4] Learning Scalable ℓ∞-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression(通过联合有损图像和残差压缩学习可伸缩ℓ∞约束的近无损图像压缩)
\n[paper](https://arxiv.org/abs/2103.17015) | [code](https://github.com/BYchao100/Scalable-Near-lossless-Image-Compression)

\n\n[3] Checkerboard Context Model for Efficient Learned Image Compression(高效学习图像压缩的棋盘上下文模型)
\n[paper](https://arxiv.org/abs/2103.15306)

\n\n[2] Slimmable Compressive Autoencoders for Practical Neural Image Compression(实用神经图像压缩的可瘦身压缩自编码器)
\n[paper](https://arxiv.org/abs/2103.15726)

\n\n[1] Attention-guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton(通过压缩感知显著性骨架的深度重构来进行注意力引导的图像压缩)
\n[paper](https://arxiv.org/abs/2103.15368)

\n\n \n\n### 异常检测(Anomaly Detection)\n\n[3] MOS: Towards Scaling Out-of-distribution Detection for Large Semantic Space(MOS：面向大型语义空间的规模化异常样本检测)
\n[paper](https://arxiv.org/abs/2105.01879)

\n\n[2] MOOD: Multi-level Out-of-distribution Detection(MOOD：多级异常样本检测)
\n[paper](https://arxiv.org/abs/2104.14726)

\n\n[1] Learning Placeholders for Open-Set Recognition(学习占位符以进行开放集识别)
\n[paper](https://arxiv.org/abs/2103.15086)

\n\n\n
\n\n \n\n## 模型训练/泛化(Model Training/Generalization)\n\n[8] A Bop and Beyond: A Second Order Optimizer for Binarized Neural Networks(【优化算法】Bop和超越：二值神经网络的二阶优化器)
\n[paper](https://arxiv.org/abs/2104.05124)

\n\n[7] Simpler Certified Radius Maximization by Propagating Covariances(通过传播协方差简化认证半径最大化)
\n[paper](https://arxiv.org/abs/2104.05888) | [video](https://youtu.be/m1ya2oNf5iE)

\n\n[6] Differentiable Patch Selection for Image Recognition(用于图像识别的可微分图像块选择)
\n[paper](https://arxiv.org/abs/2104.03059) | [code](https://github.com/google-research/google-research/tree/master/ptopk_patch_selection/)

\n\n[5] Towards Evaluating and Training Verifiably Robust Neural Networks(评估和训练可验证的稳健神经网络)
\n[paper](https://arxiv.org/abs/2104.00447) | [code](https://github.com/ZhaoyangLyu/VerifiablyRobustNN)

\n\n[4] Student-Teacher Learning from Clean Inputs to Noisy Inputs(从纯净输入到噪音输入的师生学习)
\n[paper](https://arxiv.org/pdf/2103.07600.pdf)

\n\n[3] Uncertainty-guided Model Generalization to Unseen Domains(不确定性指导的模型泛化)
\n[paper](https://arxiv.org/abs/2103.07531)

\n\n[2] Knowledge Evolution in Neural Networks(神经网络中的知识进化)
\n[paper](https://arxiv.org/pdf/2103.05152.pdf) | [code](https://github.com/ahmdtaha/knowledge_evolution)

\n\n[1] PGT: A Progressive Method for Training Models on Long Videos(一种在长视频上训练模型的渐进方法)
\n[paper](https://arxiv.org/pdf/2103.11313.pdf) | [code](https://github.com/BoPang1996/PGT)

\n\n \n\n### 噪声标签(Noisy Label)\n\n[3] Correlated Input-Dependent Label Noise in Large-Scale Image Classification(大规模图像分类中的关联输入相关标签噪声)
\n[paper](https://arxiv.org/abs/2105.10305)

\n\n[2] A Second-Order Approach to Learning with Instance-Dependent Label Noise(与实例相关的标签噪声的二阶学习方法)
\n[paper](https://arxiv.org/abs/2012.11854) | [code](https://github.com/UCSC-REAL/CAL)

\n\n[1] Partially View-aligned Representation Learning with Noise-robust Contrastive Loss(面向部分视图对齐表示学习的噪声鲁棒对比损失函数)
\n[paper](http://pengxi.me/wp-content/uploads/2021/03/2021CVPR-MvCLNwith-supp.pdf) | [code](https://github.com/XLearning-SCU/2021-CVPR-MvCLN)

\n\n \n\n### 长尾分布(Long-Tailed Distribution)\n\n[8] Disentangling Label Distribution for Long-tailed Visual Recognition(用于长尾视觉识别的解开标签分布)
\n[paper](https://arxiv.org/abs/2012.00321)

\n\n[7] Adversarial Robustness under Long-Tailed Distribution(长尾分布下的对抗鲁棒性)
\n[paper](https://arxiv.org/abs/2104.02703) | [code](https://github.com/wutong16/Adversarial_Long-Tail)

\n\n[6] Adaptive Class Suppression Loss for Long-Tail Object Detection(长尾目标检测的自适应类抑制损失)
\n[paper](https://arxiv.org/abs/2104.00885) | [code](https://github.com/CASIA-IVA-Lab/ACSL)

\n\n[5] Improving Calibration for Long-Tailed Recognition(改善长尾识别的校准)
\n[paper](https://arxiv.org/abs/2104.00466) | [code](https://github.com/Jia-Research-Lab/MiSLAS)

\n\n[4] Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification(基于对比学习的混合网络的长尾图像分类)
\n[paper](https://arxiv.org/abs/2103.14267)

\n\n[3] PML: Progressive Margin Loss for Long-tailed Age Classification(PML：长尾年龄分类的渐进式间隔损失)
\n[paper](https://arxiv.org/abs/2103.02140)

\n\n[2] MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition(MetaSAug：用于长尾视觉识别的元语义增强)
\n[paper](https://arxiv.org/pdf/2103.12579.pdf)

\n\n[1] Distribution Alignment: A Unified Framework for Long-tail Visual Recognition(分布对齐：长尾视觉识别的统一框架)
\n[paper](https://arxiv.org/abs/2103.16370) | [code](https://github.com/Megvii-BaseDetection/DisAlign)

\n
\n\n \n\n## 模型评估(Model Evaluation)\n\n[1] Are Labels Necessary for Classifier Accuracy Evaluation?(测试集没有标签，我们可以拿来测试模型吗？)
\n[paper](https://arxiv.org/abs/2007.02915) | [解读](https://zhuanlan.zhihu.com/p/328686799)

\n\n
\n\n \n\n## 多模态学习(Multi-Modal Learning)\n\n[7] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning(运用组合对比学习提取视听知识)
\n[paper](https://arxiv.org/abs/2104.10955) | [code](https://github.com/yanbeic/CCL)

\n\n[6] Cross-Modal Center Loss for 3D Cross-Modal Retrieval(用于3D跨模态检索的跨模态中心损失)
\n[paper](https://arxiv.org/abs/2008.03561) | [code](https://github.com/LongLong-Jing/Cross-Modal-Center-Loss)

\n\n[5] Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion(具有深度敏感注意力和自动多模态融合的深度RGB-D显著性检测)
\n[paper](https://arxiv.org/abs/2103.11832)

\n\n[4] There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge(多模态知识提取的自监督多目标检测与有声跟踪)
\n[paper](https://arxiv.org/abs/2103.01353) | [video](https://www.youtube.com/channel/UCRpM8k1GY3kD2TqCo_yKN3g) | [project](http://rl.uni-freiburg.de/research/multimodal-distill)

\n\n[3] LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents(动态代理的多模态未来轨迹的车道感知预测)
\n[paper](https://arxiv.org/abs/2104.00249)

\n\n[2] Multimodal Motion Prediction with Stacked Transformers(堆叠式Transformer的多模态运动预测)
\n[paper](https://arxiv.org/pdf/2103.11624.pdf) | [code](https://decisionforce.github.io/mmTransformer)

\n\n[1] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving(用于端到端自动驾驶的多模态融合Transformer)
\n[paper](https://arxiv.org/abs/2104.09224)

\n\n \n\n### 视听学习(Audio-visual Learning)\n\n[7] Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions(语音时刻：从视频描述中学习联合视听表示)
\n[paper](https://arxiv.org/abs/2105.04489)

\n\n[6] Visually Informed Binaural Audio Generation without Binaural Audios(无需双耳音频的视觉引导双耳音频生成)
\n[paper](https://arxiv.org/abs/2104.06162) | [project](https://sheldontsui.github.io/projects/PseudoBinaural)

\n\n[5] Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation(观察您的语音：学习跨模态亲和力以进行视听语音分离)
\n[paper](https://arxiv.org/abs/2104.02775) | [project](https://caffnet.github.io/)

\n\n[4] Localizing Visual Sounds the Hard Way(视觉声音定位的困难方法)
\n[paper](https://arxiv.org/abs/2104.02691)

\n\n[3] Can audio-visual integration strengthen robustness under multimodal attacks?(视听集成能否增强多模式攻击下的鲁棒性？)
\n[paper](https://arxiv.org/abs/2104.02000)

\n\n[2] Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation(发声物体视觉定位与声音分离的循环协同学习)
\n[paper](https://arxiv.org/abs/2104.02026)

\n\n[1] Positive Sample Propagation along the Audio-Visual Event Line(沿视听事件线的正样本传播)
\n[paper](https://arxiv.org/abs/2104.00239) | [code](https://github.com/jasongief/PSP_CVPR_2021)

\n\n\n
\n \n\n## 视觉预测(Vision-based Prediction)\n\n[12] Shared Cross-Modal Trajectory Prediction for Autonomous Driving(自动驾驶的共享跨模态轨迹预测)
\n[paper](https://arxiv.org/abs/2011.08436)

\n\n[11] We are More than Our Joints: Predicting how 3D Bodies Move
\n[paper](https://arxiv.org/abs/2012.00619) | [code](https://github.com/yz-cnsdqz/MOJO-release) | [project](https://yz-cnsdqz.github.io/MOJO/MOJO.html)

\n\n[10] Interpretable Social Anchors for Human Trajectory Forecasting in Crowds(人群中人类轨迹预测的可解释社会锚点)
\n[paper](https://arxiv.org/abs/2105.03136)

\n\n[9] DriveGAN: Towards a Controllable High-Quality Neural Simulation(DriveGAN：迈向可控的高质量神经仿真)
\n[paper](https://arxiv.org/abs/2104.15060)

\n\n[8] Learning Semantic-Aware Dynamics for Video Prediction(视频预测中的语义感知动态学习)
\n[paper](https://arxiv.org/abs/2104.09762)

\n\n[7] Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction(车道感知不同轨迹预测的分而治之)
\n[paper](https://arxiv.org/abs/2104.08277)

\n\n[6] GATSBI: Generative Agent-centric Spatio-temporal Object Interaction(GATSBI：以生成代理为中心的时空对象交互)
\n[paper](https://arxiv.org/abs/2104.04275)

\n\n[5] SGCN:Sparse Graph Convolution Network for Pedestrian Trajectory Prediction(SGCN：行人轨迹预测的稀疏图卷积网络)
\n[paper](https://arxiv.org/abs/2104.01528)

\n\n[4] LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents(动态代理的多模态未来轨迹的车道感知预测)
\n[paper](https://arxiv.org/abs/2104.00249)

\n\n[3] Multimodal Motion Prediction with Stacked Transformers(堆叠式Transformer的多模态运动预测)
\n[paper](https://arxiv.org/pdf/2103.11624.pdf) | [code](https://decisionforce.github.io/mmTransformer)

\n\n[2] Video Prediction Recalling Long-term Motion Context via Memory Alignment Learning(通过记忆对齐学习回忆长期运动上下文的视频预测)
\n[paper](https://arxiv.org/abs/2104.00924)

\n\n[1] MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying Motions(针对复杂时空运动的通用视频预测模型)
\n[paper](https://arxiv.org/abs/2103.02243) | [解读](https://zhuanlan.zhihu.com/p/355703957)

\n\n\n
\n\n \n\n## 数据集(Dataset)\n\n[18] VIGOR: Cross-View Image Geo-localization beyond One-to-one Retrieval(VIGOR：超越一对一检索的交叉视图图像地理定位)
\n[paper](https://arxiv.org/abs/2011.12172) | [dataset&project](https://github.com/Jeff-Zilence/VIGOR)

\n\n[17] Multi-shot Temporal Event Localization: a Benchmark(多镜头时序事件定位：基准)
\n[paper](https://arxiv.org/abs/2012.09434) | [dataset&project](https://songbai.site/muses/)

\n\n[16] Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark(当检测、跟踪与计数遇上无人机人群场景：基准)
\n[paper](https://arxiv.org/abs/2105.02440) | [dataset&code](https://github.com/VisDrone/DroneCrowd)

\n\n[15] AGORA: Avatars in Geography Optimized for Regression Analysis(AGORA：针对回归分析进行了优化的地理头像)
\n[paper](https://arxiv.org/abs/2104.14643) | [project](https://agora.is.tue.mpg.de)

\n\n[14] Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets(【数据集标注】寻求有效注释大型图像分类数据集的良好做法)
\n[paper](https://arxiv.org/abs/2104.12690) | [project](https://fidler-lab.github.io/efficient-annotation-cookbook)

\n\n[13] Learning To Count Everything(【视觉计数】学习对一切计数)
\n[paper](https://arxiv.org/abs/2104.08391) | [dataset&code](https://github.com/cvlab-stonybrook/LearningToCountEverything)

\n\n[12] DexYCB: A Benchmark for Capturing Hand Grasping of Objects(DexYCB：捕获对象的手抓握的基准)
\n[paper](https://arxiv.org/abs/2104.04631) | [dataset&code](https://dex-ycb.github.io/)\n\n[11] The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions(多智能体行为数据集：小鼠二元社交互动)
\n[paper](https://arxiv.org/abs/2104.02710) | [dataset](https://www.aicrowd.com/challenges/multi-agent-behavior-representation-modeling-measurement-and-applications)

\n\n[10] Deep Animation Video Interpolation in the Wild(野外深度动画视频插帧)
\n[paper](https://arxiv.org/abs/2104.02495) | [dataset&code](https://github.com/lisiyao21/AnimeInterp/)

\n\n[9] Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes(在动态场景中实现卷帘快门校正和去模糊)
\n[paper](https://arxiv.org/abs/2104.01601) | [dataset&code](https://github.com/zzh-tech/RSCD)

\n\n[8] UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles(UAV-Human：基于无人机的人类行为理解大型基准)
\n[paper](https://arxiv.org/abs/2104.00946)

\n\n[7] Visual Semantic Role Labeling for Video Understanding(【视频理解】用于视频理解的视觉语义角色标签)
\n[paper](https://arxiv.org/abs/2104.00990) | [dataset&code](http://vidsitu.org/)

\n\n[6] Face Forensics in the Wild(人脸伪造数据集)
\n[paper](https://arxiv.org/abs/2103.16076) | [dataset&code](https://github.com/tfzhou/FFIW)

\n\n[5] Benchmarking Representation Learning for Natural World Image Collections(【自然图像分类】自然世界影像收藏的基准表示学习)
\n[paper](https://arxiv.org/abs/2103.16483)

\n\n[4] Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark(多标签下水道缺陷分类数据集和基准)
\n[paper](https://arxiv.org/abs/2103.10895) | [project&dataset](https://vap.aau.dk/sewer-ml/)

\n\n[3] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction(单视图3D漫画面部重建的数据集和基线方法)
\n[paper](https://arxiv.org/pdf/2103.08204.pdf) | [project](https://qiuyuda.github.io/3DCaricShop/)

\n\n[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges(走向城市规模3D点云的语义分割：数据集，基准和挑战)
\n[paper](https://arxiv.org/abs/2009.03137) | [code](https://github.com/QingyongHu/SensatUrban)

\n\n[1] Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels（重新标记ImageNet：从单标签到多标签，从全局标签到局部标签）
\n[paper](https://arxiv.org/abs/2101.05022) | [code](https://github.com/naver-ai/relabel_imagenet)

\n\n
\n\n \n\n## 主动学习(Active Learning)\n\n\n[3] Vab-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning
\n[paper](https://arxiv.org/abs/2003.11249)

\n\n[2] Multiple Instance Active Learning for Object Detection（用于对象检测的多实例主动学习）
\n[paper](https://github.com/yuantn/MIAL/raw/master/paper.pdf) | [code](https://github.com/yuantn/MIAL)

\n\n[1] Sequential Graph Convolutional Network for Active Learning(主动学习的顺序图卷积网络)
\n[paper](https://arxiv.org/pdf/2006.10219.pdf)

\n\n
\n\n \n\n## 小样本学习/零样本学习(Few-shot Learning/Zero-shot Learning)\n\n[11] Few-Shot Classification with Feature Map Reconstruction Networks(使用特征映射重建网络的少样本分类)
\n[paper](https://arxiv.org/abs/2012.01506) | [code](https://github.com/Tsingularity/FRN)

\n\n[10] Learning Graph Embeddings for Compositional Zero-shot Learning(组成零样本学习的学习图嵌入)
\n[paper](https://arxiv.org/abs/2102.01987) | [code](https://github.com/ExplainableML/czsl)

\n\n[9] Self-Guided and Cross-Guided Learning for Few-Shot Segmentation(自我指导和交叉指导学习，用于小样本分割)
\n[paper](https://arxiv.org/abs/2103.16129)

\n\n[8] Contrastive Embedding for Generalized Zero-Shot Learning(广义零样本学习的对比嵌入)
\n[paper](https://arxiv.org/abs/2103.16173) | [code](https://github.com/Hanzy1996/CE-GZSL)

\n\n[7] Learning Dynamic Alignment via Meta-filter for Few-shot Learning(通过元过滤器学习动态对齐，以进行小样本学习)
\n[paper](https://arxiv.org/abs/2103.13582)

\n\n[6] Goal-Oriented Gaze Estimation for Zero-Shot Learning(零样本学习的目标导向注视估计)
\n[paper](https://arxiv.org/abs/2103.03433) | [code](https://github.com/osierboy/GEM-ZSL)

\n\n[5] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?
\n[paper](https://arxiv.org/abs/2012.06166) | [code](https://github.com/mboudiaf/RePRI-for-Few-Shot-Segmentation)

\n\n[4] Counterfactual Zero-Shot and Open-Set Visual Recognition(反事实零样本和开集视觉识别)
\n[paper](https://arxiv.org/abs/2103.00887) | [code](https://github.com/yue-zhongqi/gcm-cf)

\n\n[3] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection(小样本目标检测的语义关系推理)
\n[paper](https://arxiv.org/abs/2103.01903)

\n\n[2] Few-shot Open-set Recognition by Transformation Consistency(转换一致性的小样本开放集识别)

\n\n[1] Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning(探索小样本学习的不变表示形式和等变表示形式的互补强度)
\n[paper](https://arxiv.org/abs/2103.01315)

\n\n
\n\n \n\n## 持续学习(Continual Learning/Life-long Learning)\n\n[4] Rectification-based Knowledge Retention for Continual Learning(基于矫正的知识保留用于持续学习)
\n[paper](https://arxiv.org/abs/2103.16597)

\n\n[3] Rainbow Memory: Continual Learning with a Memory of Diverse Samples(彩虹记忆：持续学习与多种样本的记忆)
\n[paper](https://arxiv.org/abs/2103.17230) | [code](https://github.com/clovaai/rainbow-memory)

\n\n[2] Efficient Feature Transformations for Discriminative and Generative Continual Learning(区分性和生成性持续学习的有效特征转换)
\n[paper](https://arxiv.org/abs/2103.13558)

\n\n[1] Learning the Superpixel in a Non-iterative and Lifelong Manner(以非迭代和终身的方式学习超像素)
\n[paper](https://arxiv.org/pdf/2103.10681.pdf)

\n\n
\n\n \n\n## 场景图(Scene Graph)\n\n \n\n### 场景图生成(Scene Graph Generation)\n\n[4] Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation(具有自适应消息传递功能的二分图网络，用于无偏场景图的生成)
\n[paper](https://arxiv.org/abs/2104.00308)

\n\n[3] Fully Convolutional Scene Graph Generation(全卷积场景图生成)
\n[paper](https://arxiv.org/abs/2103.16083)

\n\n[2] Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation(场景图生成的语义歧义概率建模)
\n[paper](https://arxiv.org/abs/2103.05271)

\n\n[1] Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis(利用基于边缘的推理进行基于3D点的场景图分析)
\n[paper](https://arxiv.org/pdf/2103.05558.pdf)

\n\n \n\n### 场景图预测(Scene Graph Prediction)\n\n[1] SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences(基于RGB-D序列的增量3D场景图预测)
\n[paper](https://arxiv.org/abs/2103.14898)

\n\n \n\n### 场景图理解(Scene Graph Understanding)\n\n[4] Semantic Scene Completion via Integrating Instances and Scene in-the-Loop(通过实例与场景的闭环集成实现语义场景补全)
\n[paper](https://arxiv.org/abs/2104.03640) | [code](https://github.com/yjcaimeow/SISNet)

\n\n[3] 3D-to-2D Distillation for Indoor Scene Parsing(用于室内场景解析的3D到2D蒸馏)
\n[paper](https://arxiv.org/abs/2104.02243)

\n\n[2] Bidirectional Projection Network for Cross Dimension Scene Understanding(双向投影网络，用于跨维度场景理解)
\n[paper](https://arxiv.org/abs/2103.14326) | [code](https://github.com/wbhu/BPNet)

\n\n[1] Monte Carlo Scene Search for 3D Scene Understanding(蒙特卡洛场景搜索以了解3D场景)
\n[paper](https://arxiv.org/pdf/2103.07969.pdf)

\n\n

\n\n \n\n## 视觉定位(Visual Localization)\n\n[1] LoFTR: Detector-Free Local Feature Matching with Transformers(【图像特征匹配】LoFTR：基于Transformer的无检测器局部特征匹配)
\n[paper](https://arxiv.org/abs/2104.00680) | [project](https://zju3dv.github.io/loftr/)

\n\n\n

\n\n \n\n## 视觉推理/视觉问答(Visual Reasoning/VQA)\n\n[9] Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules(为我找到理由了吗？使用胶囊网络的弱监督视觉定位问答)
\n[paper](https://arxiv.org/abs/2105.04836)

\n\n[8] Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering(通往答案的桥梁：用于视频问答的结构感知图交互网络)
\n[paper](https://arxiv.org/abs/2104.14085)

\n\n[7] PQA: Perceptual Question Answering(PQA：感知问答)
\n[paper](https://arxiv.org/abs/2104.03589)

\n\n[6] Domain-robust VQA with diverse datasets and methods but no target labels(具有各种数据集和方法，但没有目标标签的领域稳健的VQA)
\n[paper](https://arxiv.org/abs/2103.15974)

\n\n[5] AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning(AGQA：组成时空推理的基准)
\n[paper](https://arxiv.org/abs/2103.16002)

\n\n[4] Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution(通过概率溯因和执行进行抽象时空推理)\n[paper](https://arxiv.org/abs/2103.14230) | [project](http://wellyzhang.github.io/project/prae.html) | [supplementary](http://wellyzhang.github.io/attach/cvpr21zhang_prae_supp.pdf)

\n\n[3] ACRE: Abstract Causal REasoning Beyond Covariation(ACRE：超越协方差的抽象因果推理)
\n[paper](https://arxiv.org/abs/2103.14232) | [project](http://wellyzhang.github.io/project/acre.html) | [Supplementary](http://wellyzhang.github.io/attach/cvpr21zhang_acre_supp.pdf)

\n\n[2] TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events(问题解答基准和有效的交通事件视频推理网络)
\n[paper](https://arxiv.org/abs/2103.15538) | [project](https://github.com/SUTDCV/SUTD-TrafficQA)
\n\n[1] Transformation Driven Visual Reasoning(变换驱动的视觉推理)
\n[paper](https://arxiv.org/pdf/2011.13160.pdf) | [code](https://github.com/hughplay/TVR) | [project](https://hongxin2019.github.io/TVR/)
\n\n\n

\n\n \n\n## 图像分类(Image Classification)\n\n[5] Benchmarking Representation Learning for Natural World Image Collections(自然世界影像收藏的基准表示学习)
\n[paper](https://arxiv.org/abs/2103.16483)

\n\n[4] Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark(多标签下水道缺陷分类数据集和基准)
\n[paper](https://arxiv.org/abs/2103.10895) | [project&dataset](https://vap.aau.dk/sewer-ml/)

\n\n[3] Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification(基于对比学习的混合网络的长尾图像分类)
\n[paper](https://arxiv.org/abs/2103.14267)

\n\n[2] PML: Progressive Margin Loss for Long-tailed Age Classification(PML：长尾年龄分类的渐进式间隔损失)
\n[paper](https://arxiv.org/abs/2103.02140)

\n\n[1] A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification(细粒度分类的半监督学习的现实评估)
\n[paper](https://arxiv.org/abs/2104.00679)

\n\n\n

\n\n \n\n## 迁移学习/domain/自适应(Transfer Learning/Domain Adaptation)\n\n[24] FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation(FixBi：无监督域适应的桥接域空间)
\n[paper](https://arxiv.org/abs/2011.09230)

\n\n[23] Ranking Neural Checkpoints(对神经检查点进行排名)
\n[paper](https://arxiv.org/abs/2011.11200)

\n\n[22] How Well Do Self-Supervised Models Transfer?(自监督模型的迁移效果如何？)
\n[paper](https://arxiv.org/abs/2011.13377) | [code](https://github.com/linusericsson/ssl-transfer)

\n\n[21] Adversarially Adaptive Normalization for Single Domain Generalization(单域泛化的对抗性自适应归一化)
\n[paper](https://arxiv.org/abs/2106.01899)

\n\n[20] Visualizing Adapted Knowledge in Domain Transfer(领域转移中适应性知识的可视化)
\n[paper](https://arxiv.org/abs/2104.10602) | [code](https://github.com/hou-yz/DA_visualization)

\n\n[19] Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation(基于实例级亲和力的无监督域自适应传输)
\n[paper](https://arxiv.org/abs/2104.01286) | [code](https://github.com/astuti/ILA-DA)

\n\n[18] Unsupervised Multi-source Domain Adaptation Without Access to Source Data(无需访问源数据的无监督多源域适配)
\n[paper](https://arxiv.org/abs/2104.01845)

\n\n[17] Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation(多目标领域适应的课程图协同教学)
\n[paper](https://arxiv.org/abs/2104.00808)

\n\n[16] Divergence Optimization for Noisy Universal Domain Adaptation(噪声通用域自适应的发散优化)
\n[paper](https://arxiv.org/abs/2104.00246)

\n\n[15] Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation(用于小样本无监督域自适应的原型跨域自监督学习)
\n[paper](https://arxiv.org/abs/2103.16765) | [project](http://xyue.io/pcs-fuda/index.html)

\n\n[14] Progressive Domain Expansion Network for Single Domain Generalization(用于单域泛化的渐进域扩展网络)
\n[paper](https://arxiv.org/abs/2103.16050)

\n\n[13] Dynamic Domain Adaptation for Efficient Inference(动态域自适应以实现高效推理)
\n[paper](https://arxiv.org/abs/2103.16403)

\n\n[12] Adaptive Methods for Real-World Domain Generalization(面向真实世界域泛化的自适应方法)
\n[paper](https://arxiv.org/abs/2103.15796)

\n\n[11] OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations(跨域跨任务表示的可传递性度量标准)
\n[paper](https://arxiv.org/abs/2103.13843)

\n\n[10] DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation(分解表示和自适应网络以实现无监督的跨域自适应)
\n[paper](https://arxiv.org/abs/2103.13447)

\n\n[9] MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation(无监督域自适应的协调域对齐和分类)
\n[paper](https://arxiv.org/pdf/2103.13575.pdf)

\n\n[8] Transferable Semantic Augmentation for Domain Adaptation(可转移的语义增强以适应领域)
\n[paper](https://arxiv.org/abs/2103.12562) | [code](https://github.com/BIT-DA/TSA)

\n\n[7] Dynamic Transfer for Multi-Source Domain Adaptation(多源域自适应的动态传输)
\n[paper](https://arxiv.org/abs/2103.10583)

\n\n[6] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation(基于双层域混合的半监督域自适应语义分割)
\n[paper](https://arxiv.org/pdf/2103.04705.pdf)

\n\n[5] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation(多源领域自适应与协作学习的语义分割)
\n[paper](https://arxiv.org/abs/2103.04717)

\n\n[4] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning(通过域随机化和元学习对视觉表示进行连续调整)
\n[paper](https://arxiv.org/abs/2012.04324)

\n\n[3] Domain Generalization via Inference-time Label-Preserving Target Projections(基于推理时标签保持目标投影的域泛化)
\n[paper](https://arxiv.org/abs/2103.01134)

\n\n[2] MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing(可伸缩的自适应视频压缩传感重建)
\n[paper](https://arxiv.org/abs/2103.01786) | [code](https://github.com/xyvirtualgroup/MetaSCI-CVPR2021)

\n\n[1] FSDR: Frequency Space Domain Randomization for Domain Generalization(FSDR：用于域泛化的频率空间域随机化)
\n[paper](https://arxiv.org/abs/2103.02370)

\n\n\n

\n\n \n\n### 度量学习(Metric Learning)\n\n[5] SLADE: A Self-Training Framework For Distance Metric Learning(SLADE：距离度量学习的自训练框架)
\n[paper](https://arxiv.org/abs/2011.10269)

\n\n[4] MetricOpt: Learning to Optimize Black-Box Evaluation Metrics(MetricOpt：学习优化黑盒评估指标)
\n[paper](https://arxiv.org/abs/2104.10631)

\n\n[3] Noise-resistant Deep Metric Learning with Ranking-based Instance Selection(具有基于排名的实例选择的抗噪深度度量学习)
\n[paper](https://arxiv.org/abs/2103.16047)

\n\n[2] Embedding Transfer with Label Relaxation for Improved Metric Learning(嵌入转移与标签松弛功能以改善度量学习)
\n[paper](https://arxiv.org/abs/2103.14908)

\n\n[1] Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales(动态度量学习：迈向可扩展的度量空间以适应多个语义尺度)
\n[paper](https://arxiv.org/pdf/2103.11781.pdf) | [code](https://github.com/SupetZYK/DynamicMetricLearning)

\n\n

\n\n \n\n## 对比学习(Contrastive Learning)\n\n[4] Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning(具有自监督对比学习的全切片图像分类的双流多实例学习网络)
\n[paper](https://arxiv.org/abs/2011.08939) | [code](https://github.com/binli123/dsmil-wsi)

\n\n[3] Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification(基于对比学习的混合网络的长尾图像分类)
\n[paper](https://arxiv.org/abs/2103.14267)

\n\n[2] AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries(AdCo：利用自训练负对抗样本高效学习无监督表示的对抗式对比学习)
\n[paper](https://arxiv.org/abs/2011.08435) | [code](https://github.com/maple-research-lab/AdCo) | [解读-AdCo基于对抗的对比学习](https://mp.weixin.qq.com/s/u7Lhzh8uYEEHfWiM32-4yQ)\n

\n\n[1] Fine-grained Angular Contrastive Learning with Coarse Labels(粗标签的细粒度角度对比学习)
\n[paper](https://arxiv.org/abs/2012.03515)

\n\n\n

\n\n \n\n## 增量学习(Incremental Learning)\n\n\n\n[4] Few-Shot Incremental Learning with Continually Evolved Classifiers(借助不断发展的分类器进行少量增量学习)
\n[paper](https://arxiv.org/abs/2104.03047)

\n\n[3] DER: Dynamically Expandable Representation for Class Incremental Learning(DER：用于类增量学习的动态可扩展表示)
\n[paper](https://arxiv.org/abs/2103.16788)

\n\n[2] Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning(小样本类增量学习的语义感知知识蒸馏)
\n[paper](https://arxiv.org/abs/2103.04059)

\n\n[1] On Learning the Geodesic Path for Incremental Learning(关于学习增量学习的测地线路径)
\n[paper](https://arxiv.org/abs/2104.08572)

\n\n\n

\n \n\n## 强化学习(Reinforcement Learning)\n\n[2] Unsupervised Visual Attention and Invariance for Reinforcement Learning(强化学习的无监督视觉注意和不变性)
\n[paper](https://arxiv.org/abs/2104.02921)

\n\n[1] Unsupervised Learning for Robust Fitting:A Reinforcement Learning Approach(无监督学习以进行稳健拟合：一种强化学习方法)
\n[paper](https://arxiv.org/abs/2103.03501)

\n\n

\n \n\n## 元学习(Meta Learning)\n\n[4] Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation(聚类、拆分、融合和更新：开放复合域自适应语义分割的元学习)
\n[paper](https://openaccess.thecvf.com/content/CVPR2021/papers/Gong_Cluster_Split_Fuse_and_Update_Meta-Learning_for_Open_Compound_Domain_CVPR_2021_paper.pdf)

\n\n[3] Faster Meta Update Strategy for Noise-Robust Deep Learning(面向噪声鲁棒深度学习的更快元更新策略)
\n[paper](https://arxiv.org/abs/2104.15092)

\n\n[2] Meta-Mining Discriminative Samples for Kinship Verification(用于亲属关系验证的判别性样本元挖掘)
\n[paper](https://arxiv.org/pdf/2103.15108.pdf)

\n\n[1] MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition(MetaSAug：用于长尾视觉识别的元语义增强)
\n[paper](https://arxiv.org/pdf/2103.12579.pdf)

\n\n

\n\n \n\n## 暂无分类(Uncategorized)\n\nRotation-Only Bundle Adjustment(仅旋转的光束法平差)
\n[paper](https://arxiv.org/abs/2011.11724)

\n\nHDR Environment Map Estimation for Real-Time Augmented Reality(用于实时增强现实的 HDR 环境贴图估计)
\n[paper](https://arxiv.org/abs/2011.10687) | [video](https://docs-assets.developer.apple.com/ml-research/papers/hdr-environment-map.mp4)

\n\nMagic Layouts: Structural Prior for Component Detection in User Interface Designs(魔术布局：用户界面设计中组件检测的结构先验)
\n[paper](https://arxiv.org/abs/2106.07615)

\n\nUnsupervised Part Discovery via Feature Alignment(通过特征对齐的无监督零件发现)
\n[paper](https://arxiv.org/abs/2012.00313)

\n\nNeural Prototype Trees for Interpretable Fine-grained Image Recognition(【可解释性】用于可解释细粒度图像识别的神经原型树)
\n[paper](https://arxiv.org/abs/2012.02046) | [code](https://github.com/M-Nauta/ProtoTree)

\n\nHow Robust are Randomized Smoothing based Defenses to Data Poisoning?(基于随机平滑的防御对数据投毒有多鲁棒？)
\n[paper](https://arxiv.org/abs/2012.01274)

\n\nUncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces(一般表面光度立体的未校准神经逆渲染)
\n[paper](https://arxiv.org/abs/2012.06777)

\n\nThe Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models(计算机视觉模型中监督和自我监督预训练的彩票假设)
\n[paper](https://arxiv.org/abs/2012.06908) | [code](https://github.com/VITA-Group/CV_LTH_Pre-training)

\n\nThe Lottery Ticket Hypothesis for Object Recognition(用于对象识别的彩票假设)
\n[paper](https://arxiv.org/abs/2012.04643)

\n\nOne-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing(【视频合成】用于视频会议的一次性自由视角神经谈话头合成)
\n[paper](https://arxiv.org/abs/2011.15126) | [project](https://nvlabs.github.io/face-vid2vid)

\n\nStochastic Image-to-Video Synthesis using cINNs(【视频合成】使用cINN的随机图像到视频合成)
\n[paper](https://arxiv.org/abs/2105.04551) | [project](https://bit.ly/3t66bnU)
\n\nNeRD: Neural 3D Reflection Symmetry Detector(NeRD：神经3D反射对称检测器)
\n[paper](https://arxiv.org/abs/2105.03211) | [code](https://github.com/zhou13/nerd)

\n\nFunction4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors(【人体体积捕获】Function4D：从非常稀疏的消费类RGBD传感器实时采集人体体积)
\n[paper](https://arxiv.org/abs/2105.01859) | [project](http://www.liuyebin.com/Function4D/Function4D.html) | [video](https://www.youtube.com/watch?v=-rWUn4fEQNU&t=126s)
\n\nAutoFlow: Learning a Better Training Set for Optical Flow(AutoFlow：学习更好的光流训练集)
\n[paper](https://arxiv.org/abs/2104.14544) | [code](https://autoflow-google.github.io/)

\n\nShot Contrastive Self-Supervised Learning for Scene Boundary Detection(【场景边界检测】【对比学习】用于场景边界检测的镜头对比自我监督学习)
\n[paper](https://arxiv.org/abs/2104.13537)

\n\nPractical Wide-Angle Portraits Correction with Deep Structured Models(【人像校正】深度结构模型的实用广角人像校正)
\n[paper](https://arxiv.org/abs/2104.12464)

\n\nDeep Lucas-Kanade Homography for Multimodal Image Alignment(【图像对齐】用于多模态图像对齐的Deep-Lucas-Kanade单应)\n[paper](https://arxiv.org/pdf/2104.11693.pdf) | [code](https://github.com/placeforyiming/CVPR21-Deep-Lucas-Kanade-Homography)

\n\nHierarchical Motion Understanding via Motion Programs(【人体动作理解】基于运动程序的分层运动理解)
\n[paper](https://arxiv.org/abs/2104.11216) | [project](https://sumith1896.github.io/motion2prog/)

\n\nManipulaTHOR: A Framework for Visual Object Manipulation(操纵器：一个视觉对象操纵的框架)
\n[paper](https://arxiv.org/abs/2104.11213)

\n\nLearning To Count Everything(【视觉计数】学习计算一切)
\n[paper](https://arxiv.org/abs/2104.08391) | [dataset&code](https://github.com/cvlab-stonybrook/LearningToCountEverything)

\n\nEgo-Exo: Transferring Visual Representations from Third-person to First-person Videos(Ego-Exo：将视觉表示从第三人称视频转移到第一人称视频)
\n[paper](https://arxiv.org/pdf/2104.07905.pdf)

\n\nHarmonious Semantic Line Detection via Maximal Weight Clique Selection(【语义线检测】通过最大权重集团选择进行和谐的语义线检测)
\n[paper](https://arxiv.org/abs/2104.06903) | [code](https://github.com/dongkwonjin/Semantic-Line-MWCS)

\n\nNeural Camera Simulators(神经相机模拟器)
\n[paper](https://arxiv.org/abs/2104.05237)

\n\nAll Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training(并非所有标签都相等：通过标签分组和共同训练增强半监督)
\n[paper](https://arxiv.org/abs/2104.05248) | [code](https://github.com/islam-nassar/semco)

\n\nShape and Material Capture at Home(在家中进行形状和材料捕获)
\n[paper](https://arxiv.org/abs/2104.06397) | [project](http://dlichy.github.io/ShapeAndMaterialAtHome/)

\n\nSOLD2: Self-supervised Occlusion-aware Line Description and Detection(【图像匹配】自我监督的遮挡感知线描述和检测)
\n[paper](https://arxiv.org/abs/2104.03362) | [code](https://github.com/cvg/SOLD2)

\n\nProgressive Temporal Feature Alignment Network for Video Inpainting(【视频修复】用于视频修复的渐进时间特征对齐网络)
\n[paper](https://arxiv.org/abs/2104.03507) | [code](https://github.com/MaureenZOU/TSAM)

\n\nA Decomposition Model for Stereo Matching(【立体声匹配】立体匹配的分解模型)
\n[paper](https://arxiv.org/abs/2104.07516)

\n\nCFNet: Cascade and Fused Cost Volume for Robust Stereo Matching(【立体声匹配】CFNet：稳健的立体声匹配的级联和融合成本)
\n[paper](https://arxiv.org/abs/2104.04314) | [code](https://github.com/gallenszl/CFNet)

\n\nSMD-Nets: Stereo Mixture Density Networks(【立体声匹配】立体声混合密度网络)
\n[paper](https://arxiv.org/abs/2104.03866) | [project](https://github.com/fabiotosi92/SMD-Nets)

\n\nDe-rendering the World's Revolutionary Artefacts(渲染世界革命文物)
\n[paper](https://arxiv.org/abs/2104.03954) | [project](https://sorderender.github.io/)

\n\nLearning Triadic Belief Dynamics in Nonverbal Communication from Videos(【视频摘要】从视频中学习非语言交流中的三重性信念动力学)
\n[paper](https://arxiv.org/abs/2104.02841)

\n\nBeyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories(超越短片：具有协作记忆的端到端视频级学习)
\n[paper](https://arxiv.org/abs/2104.01198)

\n\nPassive Inter-Photon Imaging(被动光子间成像)
\n[paper](https://arxiv.org/abs/2104.00059)

\n\nPhySG: Inverse Rendering with Spherical Gaussians for Physics-based Material Editing and Relighting(PhySG：球形高斯逆渲染，用于基于物理的材质编辑和重新照明)
\n[paper](https://arxiv.org/abs/2104.00674) | [project](https://kai-46.github.io/PhySG-website/)

\n\nLearning Camera Localization via Dense Scene Matching(【密集场景匹配】通过密集场景匹配学习相机定位)
\n[paper](https://arxiv.org/abs/2103.16792) | [code](https://github.com/Tangshitao/Dense-Scene-Matching)

\n\n\n\nSimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification(半监督分类的类似伪标签开发)
\n[paper](https://arxiv.org/abs/2103.16725) | [code](http://github.com/zijian-hu/SimPLE)

\n\nOnline Learning of a Probabilistic and Adaptive Scene Representation(概率自适应场景表示的在线学习)
\n[paper](https://arxiv.org/abs/2103.16832)

\n\nEmbracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding(拥抱不确定性：去耦和去偏置以实现可靠的实时落地)
\n[paper](https://arxiv.org/abs/2103.16848)

\n\nModel-Contrastive Federated Learning(模型对比联合学习)
\n[paper](https://arxiv.org/abs/2103.16257)

\n\nRepopulating Street Scenes(重新填充街景)
\n[paper](https://arxiv.org/abs/2103.16183)

\n\nVisual Room Rearrangement(视觉室重新布置)
\n[paper](https://arxiv.org/abs/2103.16544)

\n\nTuning IR-cut Filter for Illumination-aware Spectral Reconstruction from RGB(可调红外截止滤光片，用于从RGB感知照明的光谱重建)
\n[paper](https://arxiv.org/abs/2103.14708)

\n\nVideo Rescaling Networks with Joint Optimization Strategies for Downscaling and Upscaling(具有联合优化策略的视频缩放网络，用于缩小和放大)
\n[paper](https://arxiv.org/abs/2103.14858)

\n\nBilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction(用于域外人网格重构的双层在线适应)
\n[paper](https://arxiv.org/abs/2103.16449) | [project](https://sites.google.com/view/humanmeshboa)

\n\nPicasso: A CUDA-based Library for Deep Learning over 3D Meshes(【网格简化】毕加索：基于CUDA的3D网格深度学习库)
\n[paper](https://arxiv.org/abs/2103.15076) | [library](https://github.com/hlei-ziyan/Picasso)

\n\nCloud2Curve: Generation and Vectorization of Parametric Sketches(参数草图的生成和矢量化)
\n[paper](https://arxiv.org/abs/2103.15536)

\n\nLearning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression(【不确定性学习】学习概率序数嵌入以进行不确定性感知回归)
\n[paper](https://arxiv.org/abs/2103.13629)

\n\nSSLayout360: Semi-Supervised Indoor Layout Estimation from 360◦ Panorama(【布局估计】360°全景图的半监督室内布局估计)
\n[paper](https://arxiv.org/abs/2103.13696)

\n\nConvex Online Video Frame Subset Selection using Multiple Criteria for Data Efficient Autonomous Driving(使用多种标准的凸面在线视频帧子集选择，以实现数据高效自动驾驶)
\n[paper](https://arxiv.org/pdf/2103.13021.pdf)

\n\nScene-Intuitive Agent for Remote Embodied Visual Grounding(场景直观的代理，用于远程实现可视化接地)
\n[paper](https://arxiv.org/abs/2103.12944)

\n\nRelation-aware Instance Refinement for Weakly Supervised Visual Grounding(【visual grounding】弱监督视觉接地的关系感知实例细化)
\n[paper](https://arxiv.org/pdf/2103.12989.pdf) | [code](https://github.com/youngfly11/ReIR-WeaklyGrounding.pytorch.git)

\n\nContext-aware Biaffine Localizing Network for Temporal Sentence Grounding(上下文感知的Biaffine本地化网络，用于临时Sentence Grounding)
\n[paper](https://arxiv.org/pdf/2103.11555.pdf)

\n\nDynamic Face Video Segmentation via Reinforcement Learning(通过强化学习进行动态人脸视频分割)
\n[paper](https://arxiv.org/pdf/1907.01296) | [code](https://github.com/mapleandfire/300VW-Mask)

\n\nBack to the Feature: Learning Robust Camera Localization from Pixels to Pose(从像素到姿势学习可靠的相机定位)
\n[paper](https://arxiv.org/pdf/2103.09213.pdf) | [code](https://github.com/cvg/pixloc)

\n\n\nRotation Coordinate Descent for Fast Globally Optimal Rotation Averaging(【优化】旋转坐标下降用于快速全局最优旋转平均)
\n[paper](https://arxiv.org/pdf/2103.08292.pdf)

\n\nAffect2MM: Affective Analysis of Multimedia Content Using Emotion Causality(使用情感因果关系对多媒体内容进行情感分析)
\n[paper](https://arxiv.org/pdf/2103.06541.pdf)

\n\nDeep Graph Matching under Quadratic Constraint(【图匹配】二次约束下的深度图匹配)
\n[paper](https://arxiv.org/pdf/2103.06643.pdf)

\n\nDeep Gaussian Scale Mixture Prior for Spectral Compressive Imaging(用于光谱压缩成像的深高斯比例混合气)
\n[paper](https://arxiv.org/pdf/2103.07152.pdf) | [code](https://see.xidian.edu.cn/faculty/wsdong/Projects/DGSM-SCI.htm)

\n\nLimitations of Post-Hoc Feature Alignment for Robustness(健壮性的赛后特征对齐的局限性)
\n[paper](https://arxiv.org/abs/2103.05898)

\n\nConsensus Maximisation Using Influences of Monotone Boolean Functions(利用单调布尔函数的影响实现共识最大化)
\n[paper](https://arxiv.org/pdf/2103.04200.pdf)

\n\nNutrition5k: Towards Automatic Nutritional Understanding of Generic Food(实现对通用食品的自动营养理解)
\n[paper](https://arxiv.org/abs/2103.03375)

\n\nStructured Scene Memory for Vision-Language Navigation(用于视觉语言导航的结构化场景存储器)
\n[paper](https://arxiv.org/abs/2103.03454) | [code](https://github.com/HanqingWangAI/SSM-VLN)

\n\nLearning Asynchronous and Sparse Human-Object Interaction in Videos(视频中异步稀疏人-物交互的学习)
\n[paper](https://arxiv.org/abs/2103.02758)

\n\nSelf-supervised Geometric Perception(自我监督的几何知觉)
\n[paper](https://arxiv.org/abs/2103.03114)

\n\nQuantifying Explainers of Graph Neural Networks in Computational Pathology(计算病理学中图神经网络的量化解释器)
\n[paper](https://arxiv.org/pdf/2011.12646.pdf)

\n\nExploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts(探索具有对比场景上下文的数据高效3D场景理解)
\n[paper](http://arxiv.org/abs/2012.09165) | [project](http://sekunde.github.io/project_efficient) | [video](http://youtu.be/E70xToZLgs4)

\n\nData-Free Model Extraction(无数据模型提取)
\n[paper](https://arxiv.org/abs/2011.14779)
\n\nPatch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition(用于【位置识别】的局部全局描述符的【多尺度融合】)
\n[paper](https://arxiv.org/pdf/2103.01486.pdf) | [code](https://github.com/QVPR/Patch-NetVLAD)

\n\nRight for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations(适用于正确概念的权利：通过可解释性来修正神经符号概念)
\n[paper](https://arxiv.org/abs/2011.12854)

\n\nHierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph(基于目标关系图的分层部分可观测目标驱动策略学习)
\n[paper](https://arxiv.org/abs/2103.01350)

\n\nDomain Generalization via Inference-time Label-Preserving Target Projections（通过保留推理时间的目标投影进行域泛化）
\n[paper](https://arxiv.org/pdf/2103.01134.pdf)

\n\nDeRF: Decomposed Radiance Fields（分解的辐射场）
\n[project](https://ubc-vision.github.io/derf/)

\n\nMulti-Objective Interpolation Training for Robustness to Label Noise(多目标插值训练的鲁棒性)
\n[paper](https://arxiv.org/abs/2012.04462) | [code](https://git.io/JI40X)

\n\nCDFI: Compression-Driven Network Design for Frame Interpolation(用于帧插值的压缩驱动网络设计)
\n[paper](https://arxiv.org/pdf/2103.10559.pdf) | [code](https://github.com/tding1/CDFI)

\n\nFLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation（【视频插帧】FLAVR：用于快速帧插值的与流无关的视频表示）
\n[paper](https://arxiv.org/pdf/2012.08512.pdf) | [code](https://tarun005.github.io/FLAVR/Code) | [project](https://tarun005.github.io/FLAVR/)

\n\nDeep Animation Video Interpolation in the Wild(【视频插帧】野外深度动画视频插帧)
\n[paper](https://arxiv.org/abs/2104.02495) | [code&dataset](https://github.com/lisiyao21/AnimeInterp/)

\n\nProbabilistic Embeddings for Cross-Modal Retrieval（跨模态检索的概率嵌入）
\n[paper](https://arxiv.org/abs/2101.05068)

\n\nSelf-supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map(道路动力学和成本图的自监督式多步同时预测)

\n\nIIRC: Incremental Implicitly-Refined Classification(增量式隐式定义的分类)
\n[paper](https://arxiv.org/abs/2012.12477) | [project](https://chandar-lab.github.io/IIRC/)

\n\nFair Attribute Classification through Latent Space De-biasing(通过潜在空间去偏的公平属性分类)
\n[paper](https://arxiv.org/abs/2012.01469) | [code](https://github.com/princetonvisualai/gan-debiasing) | [project](https://princetonvisualai.github.io/gan-debiasing/)

\n\nInformation-Theoretic Segmentation by Inpainting Error Maximization(修复误差最大化的信息理论分割)
\n[paper](https://arxiv.org/abs/2012.07287)

\n\nKaleido-BERT: Vision-Language Pre-training on Fashion Domain(Kaleido-BERT：时尚领域的视觉语言预训练)
\n[paper](https://arxiv.org/abs/2103.16110) | [code](https://github.com/mczhuge/Kaleido-Bert)

\n\nUC2: Universal Cross-lingual Cross-modal Vision-and-Language Pretraining(【视频语言学习】UC2：通用跨语言跨模态视觉和语言预培训)

\n\nLess is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling(通过稀疏采样进行视频和语言学习)
\n[paper](https://arxiv.org/pdf/2102.06183.pdf) | [code](https://github.com/jayleicn/ClipBERT)

\n\nD-NeRF: Neural Radiance Fields for Dynamic Scenes(D-NeRF：动态场景的神经辐射场)
\n[paper](https://arxiv.org/abs/2011.13961) | [project](https://www.albertpumarola.com/research/D-NeRF/index.html)

\n\nFlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation(FlowStep3D：用于自监督场景流估计的模型展开)
\n[paper](https://arxiv.org/abs/2011.10147)

\n\nWeakly Supervised Learning of Rigid 3D Scene Flow(刚性3D【场景流】的弱监督学习)
\n[paper](https://arxiv.org/pdf/2102.08945.pdf) | [project](https://3dsceneflow.github.io/)

\n\n
\n\n \n\n\n# 2. CVPR2021 Oral\n\n[106] Fully Convolutional Networks for Panoptic Segmentation(Oral | 用于全景分割的全卷积网络)
\n[paper](https://arxiv.org/abs/2012.00720) | [code](https://github.com/Jia-Research-Lab/PanopticFCN)

\n\n[105] SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation(SSTVOS：用于视频对象分割的稀疏时空变换器)
\n[paper](https://arxiv.org/abs/2101.08833) | [code](https://github.com/dukebw/SSTVOS)

\n\n[104] Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning(具有自监督对比学习的全幻灯片图像分类的双流多实例学习网络)
\n[paper](https://arxiv.org/abs/2011.08939) | [code](https://github.com/binli123/dsmil-wsi)

\n\n[103] Shared Cross-Modal Trajectory Prediction for Autonomous Driving(自动驾驶的共享跨模态轨迹预测)
\n[paper](https://arxiv.org/abs/2011.08436)

\n\n[102] Open-Vocabulary Object Detection Using Captions(使用字幕的开放词汇对象检测)
\n[paper](https://arxiv.org/abs/2011.10678)

\n\n[101] Task Programming: Learning Data Efficient Behavior Representations(任务编程：学习数据高效行为表征)
\n[paper](https://arxiv.org/abs/2011.13917) | [code](https://github.com/neuroethology/TREBA) | [project](https://sites.google.com/view/task-programming)

\n\n[100] One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing(用于视频会议的一次性自由视角神经谈话头合成)
\n[paper](https://arxiv.org/abs/2011.15126) | [project](https://nvlabs.github.io/face-vid2vid)

\n\n[99] Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization(通过对比交叉视图互信息最大化学习视图解开人体姿势表示)
\n[paper](https://arxiv.org/abs/2012.01405) | [code](https://github.com/google-research/google-research/tree/master/poem)

\n\n[98] Correlated Input-Dependent Label Noise in Large-Scale Image Classification(大规模图像分类中的关联输入相关标签噪声)
\n[paper](https://arxiv.org/abs/2105.10305)

\n\n[97] A Fourier-based Framework for Domain Generalization(基于傅立叶的域泛化框架)
\n[paper](https://arxiv.org/abs/2105.11120)

\n\n[96] Omnimatte: Associating Objects and Their Effects in Video(Omnimatte：在视频中关联对象及其效果)
\n[paper](https://arxiv.org/abs/2105.06993) | [project](https://omnimatte.github.io/)

\n\n[95] CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation(CoCosNet v2：用于图像翻译的全分辨率函授学习)
\n[paper](https://arxiv.org/abs/2012.02047)

\n\n[94] MOS: Towards Scaling Out-of-distribution Detection for Large Semantic Space(MOS：面向大型语义空间的规模化异常样本检测)
\n[paper](https://arxiv.org/abs/2105.01879)

\n\n[93] Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors(【人体体积捕获】Function4D：从非常稀疏的消费类RGBD传感器实时采集人体体积)
\n[paper](https://arxiv.org/abs/2105.01859) | [project](http://www.liuyebin.com/Function4D/Function4D.html) | [video](https://www.youtube.com/watch?v=-rWUn4fEQNU&t=126s)
\n\n[92] Deep Polarization Imaging for 3D shape and SVBRDF Acquisition(用于3D形状和SVBRDF采集的深偏振成像)
\n[paper](https://arxiv.org/abs/2105.02875)

\n\n[91] GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving(GeoSim：通过可自动驾驶的几何感知合成进行逼真的视频模拟)
\n[paper](https://arxiv.org/abs/2101.06543)

\n\n[90] GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields(GIRAFFE：将场景表示为合成的生成神经特征场)
\n[paper](https://arxiv.org/abs/2011.12100) | [project](http://bit.ly/giraffe-project)

\n\n[89] DriveGAN: Towards a Controllable High-Quality Neural Simulation(DriveGAN：迈向可控的高质量神经仿真)
\n[paper](https://arxiv.org/abs/2104.15060)

\n\n[88] HOTR: End-to-End Human-Object Interaction Detection with Transformers(HOTR：使用变压器进行端到端的人与对象交互检测)
\n[paper](https://arxiv.org/abs/2104.13682)

\n\n[87] FrameExit: Conditional Early Exiting for Efficient Video Recognition(【视频理解】帧退出：有条件提前退出以实现有效的视频识别)
\n[paper](https://arxiv.org/abs/2104.13400)

\n\n[86] Unsupervised Multi-Source Domain Adaptation for Person Re-Identification(用于行人重新识别的无监督多源域适配)
\n[paper](https://arxiv.org/abs/2104.12961)

\n\n[85] Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets(寻求有效注释大型图像分类数据集的良好做法)
\n[paper](https://arxiv.org/abs/2104.12690) | [project](https://fidler-lab.github.io/efficient-annotation-cookbook)

\n\n[84] KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control(【3D关键点】关键点变形器：用于形状控制的无监督三维关键点发现)
\n[paper](https://arxiv.org/abs/2104.11224) | [project](http://tomasjakab.github.io/KeypointDeformer)

\n\n[85] ManipulaTHOR: A Framework for Visual Object Manipulation(操纵器：一个视觉对象操纵的框架)
\n[paper](https://arxiv.org/abs/2104.11213)

\n\n[84] Variational Relational Point Completion Network(变分关系点完备网络)
\n[paper](https://arxiv.org/abs/2104.10154) | [project](https://paul007pl.github.io/projects/VRCNet.html)

\n\n[83] Style-Aware Normalized Loss for Improving Arbitrary Style Transfer(一种改进任意风格转换的风格感知归一化损失算法)
\n[paper](https://arxiv.org/abs/2104.10064)

\n\n[82] Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps(基于可靠性的注意映射引导交互式视频对象分割)
\n[paper](https://arxiv.org/abs/2104.10386) | [code](https://github.com/yuk6heo/GIS-RAmap)

\n\n[81] MetricOpt: Learning to Optimize Black-Box Evaluation Metrics(MetricOpt：学习优化黑盒评估指标)
\n[paper](https://arxiv.org/abs/2104.10631)

\n\n[80] LAFEAT: Piercing Through Adversarial Defenses with Latent Features(LAFEAT：通过具有潜在功能的对抗性防御突围)
\n[paper](https://arxiv.org/abs/2104.09284)

\n\n[79] Single-view robot pose and joint angle estimation via render & compare(通过渲染和比较进行单视图机器人姿态和关节角度估计)
\n[paper](https://arxiv.org/abs/2104.09359) | [code](https://www.di.ens.fr/willow/research/robopose/)

\n\n[78] Temporal Query Networks for Fine-grained Video Understanding(时间查询网络，用于细粒度的视频理解)
\n[paper](https://arxiv.org/abs/2104.09496) | [project](http://www.robots.ox.ac.uk/~vgg/research/tqn/)

\n\n[77] Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction(车道感知不同轨迹预测的分而治之)
\n[paper](https://arxiv.org/abs/2104.08277)

\n\n[76] Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty(新旧融合：通过几何引导的不确定性学习相对相机姿势)
\n[paper](https://arxiv.org/abs/2104.08278)

\n\n[75] DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort(DatasetGAN：只需最少的人力即可获得的高效标签数据工厂)
\n[paper](https://arxiv.org/abs/2104.06490)

\n\n[74] Pixel Codec Avatars(像素编解码器头像)
\n[paper](https://arxiv.org/abs/2104.04638)

\n\n[73] CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo(CodedStereo：为大景深立体声而设计的相位掩模)
\n[paper](https://arxiv.org/abs/2104.04641)

\n\n[72] Rethinking and Improving the Robustness of Image Style Transfer(重新思考和改善图像风格迁移的鲁棒性)
\n[paper](https://arxiv.org/abs/2104.05623)

\n\n[71] Simpler Certified Radius Maximization by Propagating Covariances(通过传播协方差简化认证半径最大化)
\n[paper](https://arxiv.org/abs/2104.05888) | [video](https://youtu.be/m1ya2oNf5iE)

\n\n[70] Global Transport for Fluid Reconstruction with Learned Self-Supervision(具有自学指导的流体重建的全球运输)
\n[paper](https://arxiv.org/abs/2104.06031) | [code](https://github.com/tum-pbs/Global-Flow-Transport)

\n\n[69] SOLD2: Self-supervised Occlusion-aware Line Description and Detection(【图像匹配】自我监督的遮挡感知线描述和检测)
\n[paper](https://arxiv.org/abs/2104.03362) | [code](https://github.com/cvg/SOLD2)

\n\n[68] InverseForm: A Loss Function for Structured Boundary-Aware Segmentation(结构化边界感知分割的损失函数)
\n[paper](https://arxiv.org/abs/2104.02745)

\n\n[67] Learning Triadic Belief Dynamics in Nonverbal Communication from Videos(【视频摘要】从视频中学习非语言交流中的三重性信念动力学)
\n[paper](https://arxiv.org/abs/2104.02841)

\n\n[66] Adversarial Robustness under Long-Tailed Distribution(长尾分布下的对抗鲁棒性)
\n[paper](https://arxiv.org/abs/2104.02703) | [code](https://github.com/wutong16/Adversarial_Long-Tail)

\n\n[65] S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation(学习通用的深度特定的结构表示)
\n[paper](https://arxiv.org/abs/2104.00877)

\n\n[64] Video Prediction Recalling Long-term Motion Context via Memory Alignment Learning(【视频预测】通过记忆对准学习的视频预测调用长期运动环境)
\n[paper](https://arxiv.org/abs/2104.00924)

\n\n[63] Passive Inter-Photon Imaging(被动光子间成像)
\n[paper](https://arxiv.org/abs/2104.00059)

\n\n[62] Jigsaw Clustering for Unsupervised Visual Representation Learning(拼图聚类的无监督视觉表示学习)
\n[paper](https://arxiv.org/abs/2104.00323) | [code](https://github.com/Jia-Research-Lab/JigsawClustering)

\n\n[61] Reconstructing 3D Human Pose by Watching Humans in the Mirror(通过照镜子中的人来重建3D人的姿势)
\n[paper](https://arxiv.org/abs/2104.00340) | [project](https://zju3dv.github.io/Mirrored-Human/)

\n\n[60] Towards Evaluating and Training Verifiably Robust Neural Networks(评估和训练可验证的稳健神经网络)
\n[paper](https://arxiv.org/abs/2104.00447) | [code](https://github.com/ZhaoyangLyu/VerifiablyRobustNN)

\n\n[59] LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering(通过可分辨深度渲染进行单眼360布局估算)
\n[paper](https://arxiv.org/abs/2104.00568) | [project](https://fuenwang.ml/project/led2net)

\n\n[58] A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification(细粒度分类的半监督学习的现实评估)
\n[paper](https://arxiv.org/abs/2104.00679)

\n\n[57] SimPoE: Simulated Character Control for 3D Human Pose Estimation(用于3D人体姿势估计的模拟角色控制)
\n[paper](https://arxiv.org/abs/2104.00683) | [project](https://www.ye-yuan.com/simpoe/)

\n\n[56] DER: Dynamically Expandable Representation for Class Incremental Learning(【增量学习】用于类增量学习的动态可扩展表示形式)
\n[paper](https://arxiv.org/abs/2103.16788)

\n\n[55] Convolutional Hough Matching Networks(卷积霍夫匹配网络)
\n[paper](https://arxiv.org/abs/2103.16831)

\n\n[54] A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection(仔细研究CNN生成图像检测的傅立叶光谱差异)
\n[paper](https://arxiv.org/abs/2103.17195) | [code](https://keshik6.github.io/Fourier-Discrepancies-CNN-Detection/)

\n\n[53] DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation(DiNTS：用于3D医学图像分割的可区分神经网络拓扑搜索)
\n[paper](https://arxiv.org/abs/2103.15954)

\n\n[52] Face Forensics in the Wild(人脸伪造数据集)
\n[paper](https://arxiv.org/abs/2103.16076) | [code](https://github.com/tfzhou/FFIW)

\n\n[51] Fully Convolutional Scene Graph Generation(全卷积场景图生成)
\n[paper](https://arxiv.org/abs/2103.16083)

\n\n[50] Visual Room Rearrangement(视觉室重新布置)
\n[paper](https://arxiv.org/abs/2103.16544)

\n\n[49] Adaptive Methods for Real-World Domain Generalization(真实世界域自适应的自适应方法)
\n[paper](https://arxiv.org/abs/2103.15796)

\n\n[48] Tuning IR-cut Filter for Illumination-aware Spectral Reconstruction from RGB(可调红外截止滤光片，用于从RGB感知照明的光谱重建)
\n[paper](https://arxiv.org/abs/2103.14708)

\n\n[47] Learning Placeholders for Open-Set Recognition(学习占位符以进行开放式识别)
\n[paper](https://arxiv.org/abs/2103.15086)

\n\n[46] Zero-shot Adversarial Quantization(零样本对抗量化)
\n[paper](https://arxiv.org/abs/2103.15263) | [code](https://git.io/Jqc0y)

\n\n[45] POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture(用于单视图人体体积捕获的姿势引导选择性融合)
\n[paper](https://arxiv.org/abs/2103.15331) | [project](http://www.liuyebin.com/posefusion/posefusion.html)

\n\n[44] RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening(通过实例选择性增白提高城市场景分割中的域泛化)
\n[paper](https://arxiv.org/abs/2103.15597) | [code](https://github.com/shachoi/RobustNet)

\n\n[43] Bidirectional Projection Network for Cross Dimension Scene Understanding(【场景理解】双向投影网络，用于跨维度场景理解)
\n[paper](https://arxiv.org/abs/2103.14326) | [code](https://github.com/wbhu/BPNet)

\n\n[42] Dynamic Slimmable Network(动态可压缩网络)
\n[paper](https://arxiv.org/abs/2103.13258) | [code](https://github.com/changlin31/DS-Net)

\n\n[41] Scaling Local Self-Attention For Parameter Efficient Visual Backbones(扩展局部自注意力以获得有效的参数视觉主干)
\n[paper](https://arxiv.org/pdf/2103.12731.pdf)

\n\n[40] PGT: A Progressive Method for Training Models on Long Videos(一种在长视频上训练模型的渐进方法)
\n[paper](https://arxiv.org/pdf/2103.11313.pdf) | [code](https://github.com/BoPang1996/PGT)

\n\n[39] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net(无监督多元规范CSCℓ4Net的脑图像合成)
\n[paper](https://arxiv.org/pdf/2103.11587.pdf)

\n\n[38] Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking(Transformer与追踪器相遇：利用时间上下文进行可靠的视觉追踪)
\n[paper](https://arxiv.org/pdf/2103.11681)

\n\n[37] Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion(具有深度敏感注意力和自动多模态融合的深度RGB-D显著性检测)
\n[paper](https://arxiv.org/abs/2103.11832)

\n\n[36] Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging(【优化】旋转坐标下降用于快速全局最优旋转平均)
\n[paper](https://arxiv.org/pdf/2103.08292.pdf)

\n\n[35] MagFace: A Universal Representation for Face Recognition and Quality Assessment(MagFace：人脸识别和质量评估的通用表示形式)
\n[paper](https://arxiv.org/abs/2103.06627) | [code](https://github.com/IrvingMeng/MagFace)

\n\n[34] CoMoGAN: continuous model-guided image-to-image translation(连续的模型指导的图像到图像翻译)
\n[paper](https://arxiv.org/abs/2103.06879) | [code](http://github.com/cv-rits/CoMoGAN)

\n\n[33] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism(具有分离旋转机制的类别级6D对象姿势估计的快速基于形状的网络)
\n[paper](https://arxiv.org/abs/2103.07054)

\n\n[32] Knowledge Evolution in Neural Networks(神经网络中的知识进化)
\n[paper](https://arxiv.org/pdf/2103.05152.pdf) | [code](https://github.com/ahmdtaha/knowledge_evolution)

\n\n[31] NeX: Real-time View Synthesis with Neural Basis Expansion(NeX：具有神经基础扩展的实时视图合成)
\n[paper](https://arxiv.org/abs/2103.05606) | [code](https://nex-mpi.github.io/)

\n\n[30] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis(进行全面伪造分析的多功能基准)
\n[paper](https://arxiv.org/abs/2103.05630) | [code](https://yinanhe.github.io/projects/forgerynet.html)

\n\n[29] Dense Contrastive Learning for Self-Supervised Visual Pre-Training(自监督视觉预训练的密集对比学习)
\n[paper](https://arxiv.org/abs/2011.09157) | [code](https://github.com/WXinlong/DenseCL)

\n\n[28] Consensus Maximisation Using Influences of Monotone Boolean Functions(利用单调布尔函数的影响实现共识最大化)
\n[paper](https://arxiv.org/pdf/2103.04200.pdf)

\n\n[27] Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing(用于实例感知人类语义解析的可微分多粒度人类表示学习)
\n[paper](https://arxiv.org/pdf/2103.04570.pdf) | [code](https://github.com/tfzhou/MG-HumanParsing)

\n\n[26] Discovering Hidden Physics Behind Transport Dynamics(在运输动力学背后发现隐藏的物理)
\n[paper](https://arxiv.org/pdf/2011.12222v1.pdf)

\n\n[25] Learning Continuous Image Representation with Local Implicit Image Function(通过局部隐含图像功能学习连续图像表示)
\n[paper](https://arxiv.org/abs/2012.09161) | [code](https://github.com/yinboc/liif) | [video](https://youtu.be/6f2roieSY_8) | [project](https://yinboc.github.io/liif/) | [Interpretation: True stepless zoom! Stunning 30x interpolation: NVIDIA and others open-source LIIF, a clever new approach to image super-resolution](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247529081&idx=1&sn=2f6fa79081a85b21bbdb10ace3852f78&chksm=ec1c9f80db6b16969f049f6c389dc9250ba661c31bad69f3490eb0b9561f264f19434a65c458&token=2008688100&lang=zh_CN#rd)

\n\n[24] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
\n[paper](https://arxiv.org/pdf/2011.09094.pdf) | [code](https://github.com/dddzg/up-detr)
\nInterpretation: [Unsupervised Pre-training for Detectors](https://www.zhihu.com/question/432321109/answer/1606004872)

\n\n[23] Self-supervised Geometric Perception(自我监督的几何知觉)
\n[paper](https://arxiv.org/abs/2103.03114)

\n\n[22] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images(一种心脏标记磁共振图像运动跟踪的无监督深度学习方法)
\n[paper](https://arxiv.org/abs/2103.02772)

\n\n[21] Modeling Multi-Label Action Dependencies for Temporal Action Localization(为时间动作定位建模多标签动作相关性)
\n[paper](https://arxiv.org/pdf/2103.03027.pdf)

\n\n[20] HPS: localizing and tracking people in large 3D scenes from wearable sensors(通过可穿戴式传感器对大型3D场景中的人进行定位和跟踪)

\n\n[19] Real-Time High Resolution Background Matting(实时高分辨率背景抠像)
\n[paper](https://arxiv.org/abs/2012.07810) | [code](https://github.com/PeterL1n/BackgroundMattingV2) | [project](https://grail.cs.washington.edu/projects/background-matting-v2/) | [video](https://youtu.be/oMfPTeYDF9g)

\n\n[18] Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts(探索具有对比场景上下文的数据高效3D场景理解)
\n[paper](http://arxiv.org/abs/2012.09165) | [project](http://sekunde.github.io/project_efficient) | [video](http://youtu.be/E70xToZLgs4)

\n\n[17] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments(在动态室内环境中，通过空间划分的鲁棒神经路由可实现摄像机的重新定位)
\n[paper](https://arxiv.org/abs/2012.04746) | [project](https://ai.stanford.edu/~hewang/)

\n\n[16] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization(通过3D扫描同步进行多主体分割和运动估计)
\n[paper](https://arxiv.org/pdf/2101.06605.pdf) | [code](https://github.com/huangjh-pub/multibody-sync)

\n\n[15] Categorical Depth Distribution Network for Monocular 3D Object Detection(用于单目三维目标检测的分类深度分布网络)
\n[paper](https://arxiv.org/abs/2103.01100)

\n\n[14] PatchmatchNet: Learned Multi-View Patchmatch Stereo(学习多视图立体声)
\n[paper](https://arxiv.org/abs/2012.01411) | [code](https://github.com/FangjinhuaWang/PatchmatchNet)\n\n[13] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning(通过域随机化和元学习对视觉表示进行连续调整)
\n[paper](https://arxiv.org/abs/2012.04324)

\n\n[12] Single-Stage Instance Shadow Detection with Bidirectional Relation Learning(具有双向关系学习的单阶段实例阴影检测)

\n\n[11] Neural Geometric Level of Detail:Real-time Rendering with Implicit 3D Surfaces(神经几何细节水平：隐式3D曲面的实时渲染)
\n[paper](https://arxiv.org/abs/2101.10994) | [code](https://github.com/nv-tlabs/nglod) | [project](https://nv-tlabs.github.io/nglod/)

\n\n[9] PREDATOR: Registration of 3D Point Clouds with Low Overlap(预测器：低重叠的3D点云的配准)
\n[paper](https://arxiv.org/pdf/2011.13005.pdf) | [code](https://github.com/ShengyuH/OverlapPredator) | [project](https://overlappredator.github.io/)

\n\n[8] Domain Generalization via Inference-time Label-Preserving Target Projections(通过保留推理时间的目标投影进行域泛化)
\n[paper](https://arxiv.org/abs/2103.01134)

\n\n[7] Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction(全局一致的非刚性重建的神经变形图)
\n[paper](https://arxiv.org/abs/2012.01451) | [project](https://aljazbozic.github.io/neural_deformation_graphs/) | [video](https://youtu.be/vyq36eFkdWo)

\n\n[6] Fine-grained Angular Contrastive Learning with Coarse Labels(粗标签的细粒度角度对比学习)
\n[paper](https://arxiv.org/abs/2012.03515)

\n\n[5] Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling(通过稀疏采样进行视频和语言学习)
\n[paper](https://arxiv.org/pdf/2102.06183.pdf) | [code](https://github.com/jayleicn/ClipBERT)

\n\n[4] Cross-View Regularization for Domain Adaptive Panoptic Segmentation(用于域自适应全景分割的跨视图正则化)
\n[paper](https://arxiv.org/abs/2103.02584)

\n\n[3] Image-to-image Translation via Hierarchical Style Disentanglement(通过分层样式分解实现图像到图像的翻译)
\n[paper](https://arxiv.org/abs/2103.01456) | [code](https://github.com/imlixinyang/HiSD)

\n\n[2] Towards Open World Object Detection(开放世界中的目标检测)
\n[paper](https://arxiv.org/abs/2103.02603) | [code](https://github.com/JosephKJ/OWOD)

\n\n[1] End-to-End Video Instance Segmentation with Transformers(使用Transformer的端到端视频实例分割)
\n[paper](https://arxiv.org/abs/2011.14503)

\n\n
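\n\n \n\n# CVPR2021 Paper Interpretations\n\n【13】[Unsupervised Pre-training for Detectors (CVPR2021 Oral)](https://zhuanlan.zhihu.com/p/313684358)
\nUnsupervised pre-trained models have achieved breakthroughs in both NLP (BERT, GPT, XLNet) and CV (MoCo, SimCLR, BYOL). For unsupervised (self-supervised) pre-training, the most important step is designing a sensible pretext task, such as BERT's masked language model or MoCo's instance discrimination. Each constructs a \"label\" from the samples without supervision, pre-trains the model on it, and thereby improves downstream performance. For DETR, then: if the CNN can be pre-trained without supervision, can the transformer be pre-trained without supervision as well?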
\n[paper](https://arxiv.org/abs/2011.09094) | [code](https://github.com/dddzg/up-detr)\n
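The idea of building a label from the sample itself can be illustrated with a small sketch. It assumes a pretext task in which random patches are cropped from an image and the crop box serves as the pseudo-label a detector would have to recover. The function name `sample_patch_labels` and the patch size range are hypothetical choices for illustration, not taken from the UP-DETR code.

```python
import random

def sample_patch_labels(image, num_patches, rng):
    '''Return (patch, box) pairs; box is normalized (cx, cy, w, h).'''
    img_h, img_w = len(image), len(image[0])
    pairs = []
    for _ in range(num_patches):
        # Assumed size range: each side covers 25% to 50% of the image.
        w = rng.randint(max(1, img_w // 4), max(1, img_w // 2))
        h = rng.randint(max(1, img_h // 4), max(1, img_h // 2))
        x0 = rng.randint(0, img_w - w)
        y0 = rng.randint(0, img_h - h)
        # The crop is the query patch; its location is the free label.
        patch = [row[x0:x0 + w] for row in image[y0:y0 + h]]
        box = ((x0 + w / 2) / img_w, (y0 + h / 2) / img_h,
               w / img_w, h / img_h)
        pairs.append((patch, box))
    return pairs

rng = random.Random(0)
# Toy grayscale image, 24 rows by 32 columns.
image = [[(x + y) % 256 for x in range(32)] for y in range(24)]
pairs = sample_patch_labels(image, num_patches=3, rng=rng)
```

No human annotation is involved: the supervision signal is only the position from which each patch was cut.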

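\n\n【12】[GFLV2: an honest object-detection technique that gains accuracy at no cost!](https://zhuanlan.zhihu.com/p/313684358)
\nThis work is the first in detection to use statistics of the bounding-box uncertainty to efficiently guide localization quality estimation, improving one-stage detectors by roughly 1 to 2 AP at essentially no cost in either training or inference.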
\n[paper](https://arxiv.org/abs/2011.12885) | [code](https://github.com/implus/GFocalV2)\n
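A minimal sketch of what such distribution statistics could look like, assuming each box side is predicted as a discrete distribution over offsets and that the top-k probabilities plus their mean are used as features. The function name `distribution_statistics` and the value `k=4` are assumptions for illustration. The small learned predictor that would turn these features into a quality score is not shown.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distribution_statistics(side_logits, k=4):
    '''side_logits: four lists of logits, one per box side.'''
    features = []
    for logits in side_logits:
        probs = softmax(logits)
        topk = sorted(probs, reverse=True)[:k]
        # Top-k values plus their mean summarize how peaked the side is.
        features.extend(topk + [sum(topk) / k])
    return features

# A confident (sharp) prediction versus an uncertain (flat) one.
sharp = [[8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]] * 4
flat = [[0.0] * 8] * 4
sharp_feat = distribution_statistics(sharp)
flat_feat = distribution_statistics(flat)
```

A sharp distribution yields a large leading value while a flat one yields uniformly small values, which is the signal a quality estimator can exploit. The statistics are cheap to compute, which is consistent with the near-zero cost claimed above.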

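\n\n【11】[DCL: A New Method for Rotated Object Detection](https://zhuanlan.zhihu.com/p/354373013)
\nDensely Coded Labels (DCL) is an optimized version of Circular Smooth Label (CSL). DCL addresses two weaknesses of CSL: an overly heavy prediction layer and poor handling of square-like objects.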
\n[paper](https://arxiv.org/abs/2011.09670) | [code](https://github.com/yangxue0827/RotationDetection)\n
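To see why a dense code lightens the prediction layer, here is a minimal sketch that encodes a discretized angle as a binary or Gray code instead of a one-hot vector. The angle range [0, 180), the 180 bins, and the helper names are assumptions for illustration and may differ from the released implementation.

```python
import math

def angle_to_class(angle_deg, angle_range=180.0, num_classes=180):
    # Discretize the angle into one of num_classes equal bins.
    step = angle_range / num_classes
    return int(angle_deg // step) % num_classes

def dense_code(class_index, num_classes=180, gray=True):
    # ceil(log2(num_classes)) bits instead of num_classes one-hot outputs.
    num_bits = max(1, math.ceil(math.log2(num_classes)))
    value = class_index ^ (class_index >> 1) if gray else class_index
    return [(value >> bit) & 1 for bit in reversed(range(num_bits))]

label = dense_code(angle_to_class(90.5))
```

With 180 angle bins, a one-hot style label needs 180 outputs per anchor, while the dense code needs only 8. Under the Gray code variant, neighbouring angle bins differ in exactly one bit.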

\n\n【10】[层次风格解耦：人脸多属性篡改终于可控了(CVPR2021 Oral)](https://zhuanlan.zhihu.com/p/354258056)
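\nSince CycleGAN was proposed, the two biggest problems in image translation have been scalability (handling multiple manipulations at once) and diversity (generating different results). Yet no method has managed to balance scalability and diversity while also keeping the manipulation as intended. For example, in face attribute manipulation, we want to add bangs to a face, but the hair color or the background changes; or we want to add glasses, and the gender and age end up changing as well. HiSD is designed to solve these problems, and it also supports both generating such styles from noise and extracting them from images.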
\n[paper](https://arxiv.org/abs/2103.01456) | [code](https://github.com/imlixinyang/HiSD)\n

\n\n【9】[Transformer再下一城！low-level多个任务榜首被占领，北大华为等联合提出预训练模型IPT](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247524529&idx=1&sn=e39e67981b2afd9a5369cc843ddf28fe&chksm=ec1c8d48db6b045e7cf2d37c5633da8d3caf5e53178fe6df0913f2a04bc143fcc2e504d6f9be&token=2008688100&lang=zh_CN#rd)
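\nThis work studies low-level computer vision tasks (such as denoising, super-resolution, and deraining) and proposes a new pre-trained model, IPT (image processing transformer). To make full use of the transformer's capacity, the authors use the well-known ImageNet to build a large number of degraded image pairs and train IPT on them; IPT has multiple heads and multiple tails to fit different degradation models. The authors also introduce contrastive learning to adapt better to different image processing tasks. After fine-tuning, the pre-trained model can be applied effectively to different tasks. With a single pre-trained model, IPT outperforms SOTA methods on several low-level benchmarks.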
\n[paper](https://arxiv.org/abs/2012.00364)\n

\n\n【8】[真正的无极放大！30x插值效果惊艳，英伟达等开源LIIF：巧妙的图像超分新思路](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247529081&idx=1&sn=2f6fa79081a85b21bbdb10ace3852f78&chksm=ec1c9f80db6b16969f049f6c389dc9250ba661c31bad69f3490eb0b9561f264f19434a65c458&token=2008688100&lang=zh_CN#rd)
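\nA novel continuous image representation that builds a bridge between discrete 2D images and continuous 2D images. Thanks to this “continuous representation”, it can present an image at an adjustable resolution, achieving true “stepless zoom”, up to 30x magnification.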
\n[paper](https://arxiv.org/abs/2012.09161) | [code](https://github.com/yinboc/liif) | [video](https://youtu.be/6f2roieSY_8) | [project](https://yinboc.github.io/liif/)\n

\n\n【7】[AdCo基于对抗的对比学习](https://mp.weixin.qq.com/s/u7Lhzh8uYEEHfWiM32-4yQ)
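\nIn self-supervised learning, contrastive learning has shown clear advantages on downstream tasks such as classification and detection. How to make full use of negative samples to improve learning efficiency and quality remains a direction worth exploring. This paper is the first to propose learning the negative samples directly, end-to-end, in an adversarial manner, reaching SOTA on ImageNet and on downstream tasks. AdCo needs only 8196 negative samples (one eighth of the number used by MoCo v2) to reach the same accuracy. Moreover, with the same number of parameters as the Prediction MLP in BYOL, these directly trainable negative samples still achieve comparable results. This suggests that, in the era of self-supervised learning, making negative samples learnable lets contrastive learning retain advantages such as high learning efficiency, stable training, and high accuracy.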
\n[paper](https://arxiv.org/abs/2011.08435) | [code](https://github.com/maple-research-lab/AdCo)\n

\n\n【6】[超分性能不降低，计算量降低50%：加速图像超分的ClassSR](https://zhuanlan.zhihu.com/p/355873199)
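\nThis paper explores accelerating super-resolution networks in the low-level domain. It combines classification with super-resolution: each sub-image is assigned to a super-resolution branch according to how hard it is to restore, which lowers the overall computational cost. Flat regions that are easy to restore go to a low-complexity branch, and textured regions that are hard to restore go to a high-complexity branch. The method saves up to 50% of the computation without degrading super-resolution performance.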
\n[paper](https://arxiv.org/abs/2103.04039)\n

\n\n【5】[ MotionRNN：针对复杂时空运动的通用视频预测模型](https://zhuanlan.zhihu.com/p/355703957)
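\nVideo prediction is widely used in tasks such as precipitation nowcasting, traffic flow prediction, and visual planning for robots. Real-world motion, however, is extremely complex and constantly changing: changes of direction and speed and limb movement in human motion, or the formation, dissipation, displacement, and deformation of cloud clusters in radar echoes. Such complex spatiotemporal variation makes accurate prediction of future motion very challenging.
\nTo handle complex spatiotemporal motion, we observe that real-world motion can be decomposed in space and time into an overall motion trend and a transient variation, and on that basis propose MotionRNN, which models the motion trend and the transient variation in a unified way. As a general video prediction model, MotionRNN is also flexible: it can be combined with many RNN-based spatiotemporal prediction models and consistently improves their ability to handle complex spatiotemporal motion.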
\n[paper](https://arxiv.org/abs/2103.02243)\n

\n\n【4】[Statistical Texture Learning](https://zhuanlan.zhihu.com/p/354501947)
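\nThis work improves visual learning by analyzing and enhancing low-level texture details, and validates the idea on segmentation, where it yields intuitive, well-founded, and effective gains. Drawing inspiration from traditional image analysis, we build a Statistical Texture Learning framework that learns low-level texture (analysis plus enhancement) within a CNN architecture and delivers substantial performance gains.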
\n[paper](https://arxiv.org/abs/2103.04133)\n

\n\n【3】[二次元妹子五官画风都能改，周博磊团队用无监督方法控制GAN(CVPR2021 Oral)](https://mp.weixin.qq.com/s/JmGCwi4QF5VVzujj7KIBwg)
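\nGANs can now not only draw anime girls but also precisely adjust facial features, expression, pose, and drawing style, while keeping the other factors as unchanged as possible when one factor is adjusted. SeFa works with common GAN models such as PGGAN, StyleGAN, BigGAN, and StyleGAN2. It is effective not only on anime faces; it can even steer a cat to face up, down, left, or right.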
\n[paper](https://arxiv.org/abs/2007.06600) | [code](https://github.com/genforce/sefa) | [Colab](https://colab.research.google.com/github/genforce/sefa/blob/master/docs/SeFa.ipynb)\n

\n\n【2】[Inception convolution](https://zhuanlan.zhihu.com/p/354194188)
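\nOur recent work accepted to CVPR2021 uses optimization techniques to search for new convolution patterns, aiming to find a simple, deployment-friendly convolution that helps improve baselines across downstream tasks.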
\n[paper](https://arxiv.org/pdf/2012.13587.pdf) | [code](https://github.com/yifan123/IC-Conv)\n

\n\n【1】[RepVGG：极简架构，SOTA性能，让VGG式模型再次伟大（CVPR-2021）](https://zhuanlan.zhihu.com/p/344324470)
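\nOur recent work RepVGG uses structural re-parameterization to realize a minimal VGG-style single-path architecture, with 3x3 convolutions all the way through, reaching SOTA speed and performance and over 80% accuracy on ImageNet. It has been accepted to CVPR-2021. Without NAS, without attention, without novel activation functions, and even without branch structures, using only 3x3 convolutions and ReLU, it still reaches SOTA performance.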
\n[paper](https://arxiv.org/abs/2101.03697) | [Open-source pre-trained models and code (PyTorch)](https://github.com/DingXiaoH/RepVGG) | [MegEngine version](https://github.com/megvii-model/RepVGG)\n\n\n
\n\n \n\n# 4. CVPR2021 Workshop\n\n[99] Learning from Incomplete Features by Simultaneous Training of Neural Networks and Sparse Coding(通过同时训练神经网络和稀疏编码从不完整特征中学习)
\n[paper](https://arxiv.org/abs/2011.14047)

\n\n[98] Anticipative Video Transformer(预期视频转换器)
\n[paper](https://arxiv.org/abs/2106.02036) | [project](http://facebookresearch.github.io/AVT)

\n\n[97] Occlusion Guided Scene Flow Estimation on 3D Point Clouds(3D 点云上的遮挡引导场景流估计)
\n[paper](https://arxiv.org/abs/2011.14880)

\n\n[96] Video-based Person Re-identification without Bells and Whistles(基于视频的行人重识别)
\n[paper](https://arxiv.org/abs/2105.10678) | [code](https://github.com/jackie840129/CF-AAN)

\n\n[95] ADNet: Attention-guided Deformable Convolutional Network for High Dynamic Range Imaging(ADNet：用于高动态范围成像的注意力引导可变形卷积网络)
\n[paper](https://arxiv.org/abs/2105.10697)

\n\n[94] PAL: Intelligence Augmentation using Egocentric Visual Context Detection(PAL：使用以自我为中心的视觉上下文检测的智能增强)
\n[paper](https://arxiv.org/abs/2105.10735)

\n\n[93] PLM: Partial Label Masking for Imbalanced Multi-label Classification(PLM：不平衡多标签分类的部分标签屏蔽)
\n[paper](https://arxiv.org/abs/2105.10782)

\n\n[92] GOO: A Dataset for Gaze Object Prediction in Retail Environments(GOO：零售环境中注视对象预测的数据集)
\n[paper](https://arxiv.org/abs/2105.10793)

\n\n[91] Wisdom for the Crowd: Discoursive Power in Annotation Instructions for Computer Vision(大众的智慧：计算机视觉注释指令中的话语权)
\n[paper](https://arxiv.org/abs/2105.10990)

\n\n[90] High-level camera-LiDAR fusion for 3D object detection with machine learning(使用机器学习进行 3D 物体检测的高级相机-LiDAR 融合)
\n[paper](https://arxiv.org/abs/2105.11060)

\n\n[89] Real-time Monocular Depth Estimation with Sparse Supervision on Mobile(移动端稀疏监督下的实时单目深度估计)
\n[paper](https://arxiv.org/abs/2105.12053)

\n\n[88] RSCA: Real-time Segmentation-based Context-Aware Scene Text Detection(RSCA：基于实时分割的上下文感知场景文本检测)
\n[paper](https://arxiv.org/abs/2105.12789)

\n\n[87] Automatic Non-Linear Video Editing Transfer(自动非线性视频编辑迁移)
\n[paper](https://arxiv.org/abs/2105.06988)
\n\n[86] City-Scale Multi-Camera Vehicle Tracking Guided by Crossroad Zones(十字路口区域引导的城市规模多摄像机车辆跟踪)
\n[paper](https://arxiv.org/abs/2105.06623) | [code](https://github.com/LCFractal/AIC21-MTMC)

\n\n[85] High-Resolution Complex Scene Synthesis with Transformers(基于Transformer的高分辨率复杂场景合成)
\n[paper](https://arxiv.org/abs/2105.06458)

\n\n[84] Deep Graphics Encoder for Real-Time Video Makeup Synthesis from Example(基于示例的实时视频妆容合成深度图形编码器)
\n[paper](https://arxiv.org/abs/2105.06407)

\n\n[83] Learning to Generate Novel Scene Compositions from Single Images and Videos(学习从单个图像和视频生成新颖的场景构图)
\n[paper](https://arxiv.org/abs/2105.05847)

\n\n[82] Directional GAN: A Novel Conditioning Strategy for Generative Networks(定向GAN：面向生成网络的新型调节策略)
\n[paper](https://arxiv.org/abs/2105.05712)

\n\n[81] EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration(EDPN：用于模糊图像恢复的增强型深度金字塔网络)
\n[paper](https://arxiv.org/abs/2105.04872)

\n\n[80] ChaLearn LAP Large Scale Signer Independent Isolated Sign Language Recognition Challenge: Design, Results and Future Research(ChaLearn LAP大规模非特定手语者孤立手语识别挑战：设计、结果与未来研究)
\n[paper](https://arxiv.org/abs/2105.05066)

\n\n[79] Rethinking of Radar's Role: A Camera-Radar Dataset and Systematic Annotator via Coordinate Alignment(重新考虑雷达的作用：通过坐标对齐的摄像机-雷达数据集和系统注释器)
\n[paper](https://arxiv.org/abs/2105.05207)

\n\n[78] Good Practices and A Strong Baseline for Traffic Anomaly Detection(【AI CITY第一名】交通异常检测的良好实践与强基线)
\n[paper](https://arxiv.org/abs/2105.03827) | [code](https://github.com/Endeavour10020/AICity2021-Anomaly-Detection)
\n\n[77] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms(Dynamic-OFA：用于在异构嵌入式平台上进行性能扩展的运行时DNN架构切换)
\n[paper](https://arxiv.org/abs/2105.03596)

\n\n[76] The iWildCam 2021 Competition Dataset(iWildCam 2021竞赛数据集)
\n[paper](https://arxiv.org/abs/2105.03494)

\n\n[75] Pareto-Optimal Quantized ResNet Is Mostly 4-bit(帕累托最优量化ResNet主要为4位)
\n[paper](https://arxiv.org/abs/2105.03536) | [code](https://github.com/google-research/google-research/tree/master/aqt)

\n\n[74] Neural 3D Scene Compression via Model Compression(通过模型压缩进行神经3D场景压缩)
\n[paper](https://arxiv.org/abs/2105.03120)

\n\n[73] BasisNet: Two-stage Model Synthesis for Efficient Inference(BasisNet：面向高效推理的两阶段模型合成)
\n[paper](https://arxiv.org/abs/2105.03014)

\n\n[72] Effectively Leveraging Attributes for Visual Similarity(有效地利用属性实现视觉相似性)
\n[paper](https://arxiv.org/abs/2105.01695)

\n\n[71] Physically Inspired Dense Fusion Networks for Relighting(用于重光照的受物理启发的密集融合网络)
\n[paper](https://arxiv.org/abs/2105.02209)

\n\n[70] Feedback control of event cameras(事件摄像机的反馈控制)
\n[paper](https://arxiv.org/abs/2105.00409)

\n\n[69] EQFace: A Simple Explicit Quality Network for Face Recognition(EQFace：用于面部识别的简单显式质量网络)
\n[paper](https://arxiv.org/abs/2105.00634) | [code](https://github.com/deepcam-cn/facequality)

\n\n[68] S3Net: A Single Stream Structure for Depth Guided Image Relighting(S3Net：用于深度引导图像重新照明的单一流结构)
\n[paper](https://arxiv.org/abs/2105.00681)

\n\n[67] Multi-modal Bifurcated Network for Depth Guided Image Relighting(用于深度引导图像重新照明的多模式分叉网络)
\n[paper](https://arxiv.org/abs/2105.00690)

\n\n[66] Renofeation: A Simple Transfer Learning Method for Improved Adversarial Robustness(Renofeation：一种简单的转移学习方法，以提高对抗性的鲁棒性)
\n[paper](https://arxiv.org/abs/2002.02998)

\n\n[65] D-LEMA: Deep Learning Ensembles from Multiple Annotations -- Application to Skin Lesion Segmentation(D-LEMA：来自多种注释的深度学习集合-在皮肤病变分割中的应用)
\n[paper](https://arxiv.org/abs/2012.07206)

\n\n[64] Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection(伪IoU：改进无锚对象检测中的标签分配)
\n[paper](https://arxiv.org/abs/2104.14082) | [code](https://github.com/SHI-Labs/Pseudo-IoU-for-Anchor-Free-Object-Detection)

\n\n[63] Cluster-driven Graph Federated Learning over Multiple Domains(跨多个域的聚类驱动图联邦学习)
\n[paper](https://arxiv.org/abs/2104.14628)

\n\n[62] Perceptual Image Quality Assessment with Transformers(Transformer的感知图像质量评估)
\n[paper](https://arxiv.org/abs/2104.14730) | [code](https://github.com/manricheon/IQT)

\n\n[61] CASSOD-Net: Cascaded and Separable Structures of Dilated Convolution for Embedded Vision Systems and Applications(CASSOD-Net：用于嵌入式视觉系统和应用的空洞卷积级联可分离结构)
\n[paper](https://arxiv.org/abs/2104.14126)

\n\n[60] NTIRE 2021 Challenge on Video Super-Resolution(NTIRE 2021视频超分辨率挑战)
\n[paper](https://arxiv.org/abs/2104.14852)

\n\n[59] NTIRE 2021 Challenge on Image Deblurring(NTIRE 2021图像去模糊挑战)
\n[paper](https://arxiv.org/abs/2104.14854)

\n\n[58] Differentiable Event Stream Simulator for Non-Rigid 3D Tracking(用于非刚性3D跟踪的可微事件流模拟器)
\n[paper](https://arxiv.org/abs/2104.15139) | [code](http://gvv.mpi-inf.mpg.de/projects/Event-based_Non-rigid_3D_Tracking)\n\n[57] Sign Segmentation with Changepoint-Modulated Pseudo-Labelling(基于变点调制伪标签的手语分割)
\n[paper](https://arxiv.org/abs/2104.13817)

\n\n[56] Boosting Co-teaching with Compression Regularization for Label Noise(通过压缩正则化促进共教学以消除标签噪声)
\n[paper](https://arxiv.org/abs/2104.13766) | [project](https://github.com/yingyichen-cyy/Nested-Co-teaching)

\n\n[55] Towards Fair Federated Learning with Zero-Shot Data Augmentation(【数据增广】借助零样本数据增强实现公平的联邦学习)
\n[paper](https://arxiv.org/abs/2104.13417)

\n\n[54] Unsupervised Detection of Cancerous Regions in Histology Imagery using Image-to-Image Translation(【图像翻译】【医学影像】使用图像到图像翻译的组织学图像中癌区域的无监督检测)
\n[paper](https://arxiv.org/abs/2104.13786)

\n\n[53] Width Transfer: On the (In)variance of Width Optimization(宽度迁移：关于宽度优化的（不）变性)
\n[paper](https://arxiv.org/abs/2104.13255)

\n\n[52] Three-stream network for enriched Action Recognition(【动作识别】三流网络，用于丰富动作识别)
\n[paper](https://arxiv.org/abs/2104.13051)

\n\n[51] Detecting and Matching Related Objects with One Proposal Multiple Predictions(通过单提案多预测检测并匹配相关对象)
\n[paper](https://arxiv.org/abs/2104.12574) | [code](https://github.com/foreverYoungGitHub/detect-and-match-related-objects)

\n\n[50] The 5th AI City Challenge(第五届AI城市挑战赛)
\n[paper](https://arxiv.org/abs/2104.12233)

\n\n[49] Do All MobileNets Quantize Poorly? Gaining Insights into the Effect of Quantization on Depthwise Separable Convolutional Networks Through the Eyes of Multi-scale Distributional Dynamics(【模型压缩】【移动端】所有MobileNets量化效果不佳吗？通过多尺度分布动力学了解量化对深度可分离卷积网络的影响)
\n[paper](https://arxiv.org/abs/2104.11849)

\n\n[48] Multi-Scale Hourglass Hierarchical Fusion Network for Single Image Deraining(【图像去雨】用于单图像去雨的多尺度沙漏分层融合网络)
\n[paper](https://arxiv.org/abs/2104.12100) | [code](https://github.com/cxtalk/MH2F-Net)

\n\n[47] Class-Incremental Experience Replay for Continual Learning under Concept Drift(【增量学习】概念漂移下用于持续学习的类增量经验回放)
\n[paper](https://arxiv.org/abs/2104.11861)

\n\n[46] SBNet: Segmentation-based Network for Natural Language-based Vehicle Search(SBNet：基于分割的自然语言车辆搜索网络)
\n[paper](https://arxiv.org/abs/2104.11589)

\n\n[45] Patch Shortcuts: Interpretable Proxy Models Efficiently Find Black-Box Vulnerabilities(补丁快捷方式：可解释的代理模型有效地发现黑盒漏洞)
\n[paper](https://arxiv.org/abs/2104.11691)

\n \n[44] Region-Adaptive Deformable Network for Image Quality Assessment(用于图像质量评价的区域自适应变形网络)
\n[paper](https://arxiv.org/abs/2104.11599) | [code](https://github.com/IIGROUP/RADN)

\n\n[43] Network Space Search for Pareto-Efficient Spaces(【深度学习训练】Pareto有效空间的网络空间搜索)
\n[paper](https://arxiv.org/abs/2104.11014)

\n\n[42] A Strong Baseline for Vehicle Re-Identification(【AI城市】车辆重新识别的强大基线)
\n[paper](https://arxiv.org/abs/2104.10850) | [code](https://github.com/cybercore-co-ltd/track2_aicity_2021)

\n\n[41] Multi-task Learning with Attention for End-to-end Autonomous Driving(【自动驾驶】端到端自主驾驶的多任务注意力学习)
\n[paper](https://arxiv.org/abs/2104.10753)

\n\n[40] Perceptual Loss for Robust Unsupervised Homography Estimation(鲁棒无监督单应估计的感知损失)
\n[paper](https://arxiv.org/abs/2104.10011)

\n\n[39] Table Tennis Stroke Recognition Using Two-Dimensional Human Pose Estimation(【人体姿态估计】基于二维人体姿态估计的乒乓球击球识别)
\n[paper](https://arxiv.org/abs/2104.09907)

\n\n[38] Comparing Representations in Tracking for Event Camera-based SLAM(基于事件相机的SLAM中跟踪表示的比较)
\n[paper](https://arxiv.org/abs/2104.09887)

\n\n[37] Shadow Neural Radiance Fields for Multi-view Satellite Photogrammetry(【遥感图像】多视点卫星摄影测量中的阴影神经辐射场)
\n[paper](https://arxiv.org/abs/2104.09877)

\n\n[36] Distill on the Go: Online knowledge distillation in self-supervised learning(【知识蒸馏】随行蒸馏：自监督学习中的在线知识蒸馏)
\n[paper](https://arxiv.org/abs/2104.09866)

\n\n[35] An Efficient Approach for Anomaly Detection in Traffic Videos(【视频异常检测】一种有效的交通视频异常检测方法)
\n[paper](https://arxiv.org/abs/2104.09758)

\n\n[34] Class-Incremental Learning with Generative Classifiers(【增量学习】基于生成分类器的类增量学习)
\n[paper](https://arxiv.org/abs/2104.10093)

\n\n[33] Engineering Sketch Generation for Computer-Aided Design(面向计算机辅助设计的工程草图生成)
\n[paper](https://arxiv.org/abs/2104.09621)

\n\n[32] IB-DRR: Incremental Learning with Information-Back Discrete Representation Replay(【增量学习】IB-DRR：基于信息反馈的增量学习)
\n[paper](https://arxiv.org/abs/2104.10588)

\n\n[31] Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis(【可解释性】重新审视类激活映射的可解释性评价：一个新的度量和实验分析)
\n[paper](https://arxiv.org/abs/2104.10252)

\n\n[30] GAN-Based Data Augmentation and Anonymization for Skin-Lesion Analysis: A Critical Review(基于GAN的皮肤病变分析数据增强和匿名化：批判性综述)
\n[paper](https://arxiv.org/abs/2104.10603)

\n\n[29] Brittle Features May Help Anomaly Detection(脆弱特征可能有助于异常检测)
\n[paper](https://arxiv.org/abs/2104.10453)

\n\n[28] I Only Have Eyes for You: The Impact of Masks On Convolutional-Based Facial Expression Recognition(我的眼里只有你：口罩对基于卷积的面部表情识别的影响)
\n[paper](https://arxiv.org/abs/2104.08353)

\n\n[27] Assessment of deep learning based blood pressure prediction from PPG and rPPG signals(从PPG和rPPG信号评估基于深度学习的血压预测)
\n[paper](https://arxiv.org/abs/2104.09313)

\n\n[26] A Two-branch Neural Network for Non-homogeneous Dehazing via Ensemble Learning(通过集合学习进行非均匀去雾的两分支神经网络)
\n[paper](https://arxiv.org/abs/2104.08902)

\n\n[25] End-to-End Interactive Prediction and Planning with Optical Flow Distillation for Autonomous Driving(用于自动驾驶的带有光流蒸馏的端到端交互式预测和计划)
\n[paper](https://arxiv.org/abs/2104.08862) | [project](http://sites.google.com/view/inmp-ofd)

\n\n[24] Reconsidering CO2 emissions from Computer Vision(考虑计算机视觉产生的二氧化碳排放量)
\n[paper](https://arxiv.org/abs/2104.08702)

\n\n[23] On Training Sketch Recognizers for New Domains(关于新领域的训练草图识别器)
\n[paper](https://arxiv.org/abs/2104.08850)

\n\n[22] Filtering Empty Camera Trap Images in Embedded Systems(过滤嵌入式系统中的空相机陷阱图像)
\n[paper](https://arxiv.org/abs/2104.08859)

\n\n[21] Contrastive Learning Improves Model Robustness Under Label Noise(对比学习提高了标签噪声下的模型鲁棒性)
\n[paper](https://arxiv.org/abs/2104.08984) | [code](https://github.com/arghosh/noisy_label_pretrain)

\n\n[20] Restoration of Video Frames from a Single Blurred Image with Motion Understanding(通过运动理解从单个模糊图像恢复视频帧)
\n[paper](https://arxiv.org/abs/2104.09134)

\n\n[19] LSPnet: A 2D Localization-oriented Spacecraft Pose Estimation Neural Network(LSPnet：面向二维定位的航天器姿态估计神经网络)
\n[paper](https://arxiv.org/abs/2104.09248)

\n\n[18] Plants Don't Walk on the Street: Common-Sense Reasoning for Reliable Semantic Segmentation(植物不在大街上行走：可靠语义分割的常识推理)
\n[paper](https://arxiv.org/abs/2104.09254)

\n\n[17] Temporal Consistency Loss for High Resolution Textured and Clothed 3DHuman Reconstruction from Monocular Video(从单眼视频的高分辨率纹理化和布料化3D人体重建的时间一致性损失)
\n[paper](https://arxiv.org/abs/2104.09259)

\n\n[16] A Mathematical Analysis of Learning Loss for Active Learning in Regression(回归中主动学习的学习损失的数学分析)
\n[paper](https://arxiv.org/abs/2104.09315)

\n\n[15] Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting(SoccerNet-v2中的摄像机标定和球员定位及其表示在动作定位中的研究)
\n[paper](https://arxiv.org/abs/2104.09333)

\n\n[14] DANICE: Domain adaptation without forgetting in neural image compression(DANICE：神经图像压缩中无遗忘的域自适应)
\n[paper](https://arxiv.org/abs/2104.09370)\n\n[13] OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas(OmniLayout：从室内球形全景图进行房间布局重建)
\n[paper](https://arxiv.org/abs/2104.09403) | [code](https://github.com/rshivansh/OmniLayout)

\n\n[12] Dual Contrastive Learning for Unsupervised Image-to-Image Translation(【图像翻译】双重对比学习，实现无监督的图像到图像翻译)
\n[paper](https://arxiv.org/abs/2104.07689) | [code](https://github.com/JunlinHan/DCLGAN)

\n\n[11] OmniFlow: Human Omnidirectional Optical Flow(OmniFlow：人类全向光流)
\n[paper](https://arxiv.org/abs/2104.07960)

\n\n[10] I Find Your Lack of Uncertainty in Computer Vision Disturbing(你在计算机视觉中缺乏不确定性，令我不安)
\n[paper](https://arxiv.org/abs/2104.08188)

\n\n[9] Fast Walsh-Hadamard Transform and Smooth-Thresholding Based Binary Layers in Deep Neural Networks(神经网络中快速Walsh-Hadamard变换和基于平滑阈值的二进制层)
\n[paper](https://arxiv.org/abs/2104.07085)

\n\n[8] Machine-learned 3D Building Vectorization from Satellite Imagery(【遥感图像】通过卫星图像进行机器学习的3D建筑矢量化)
\n[paper](https://arxiv.org/abs/2104.06485)

\n\n[7] Graph-based Person Signature for Person Re-Identifications(【行人重识别】用于行人重识别的基于图的人员签名)
\n[paper](https://arxiv.org/abs/2104.06770)

\n\n[6] Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts(【动作识别】用于足球转播中动作定位的时间感知特征池化)
\n[paper](https://arxiv.org/abs/2104.06779)

\n\n[5] Continual learning in cross-modal retrieval(【持续学习】跨模式检索中的持续学习)
\n[paper](https://arxiv.org/abs/2104.06806)

\n\n[4] Towards Automated and Marker-less Parkinson Disease Assessment: Predicting UPDRS Scores using Sit-stand videos(迈向自动无标记帕金森病评估：使用坐立视频预测UPDRS得分)
\n[paper](https://arxiv.org/abs/2104.04650)

\n\n[3] Efficient Space-time Video Super Resolution using Low-Resolution Flow and Mask Upsampling(【图像超分】使用低分辨率流和遮罩上采样的高效时空视频超分辨率)
\n[paper](https://arxiv.org/abs/2104.05778) | [project](https://github.com/saikatdutta/FMU_STSR)

\n\n[2] Generalizable Multi-Camera 3D Pedestrian Detection(【行人检测】通用多摄像机3D行人检测)
\n[paper](https://arxiv.org/abs/2104.05813)

\n\n[1] Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation(【知识蒸馏】【视觉问答】通过知识蒸馏处理视觉问题回答差异预测任务中的缺失模态)
\n[paper](https://arxiv.org/abs/2104.05965)

\n\n
\n\n \n\n# 5. To do list\n\n* CVPR2021论文分享\n" }, { "path": "CVPR2022.md", "content": "* 推荐阅读：
\n * [ICCV2021/2019/2017 论文/代码/解读/直播合集](https://github.com/extreme-assistant/ICCV2021-Paper-Code-Interpretation)\n * [2020-2021年计算机视觉综述论文汇总](https://github.com/extreme-assistant/survey-computer-vision)\n * [国内外优秀的计算机视觉团队汇总](https://github.com/extreme-assistant/Awesome-CV-Team)\n------\n\n# CVPR2022最新信息及论文下载（Papers/Codes/Project/PaperReading／Demos/直播分享／论文分享会等）\n\n官网链接：http://CVPR2022.thecvf.com
\n时间：2022年6月19日-6月24日
\n论文接收公布时间：2022年3月2日
\n\n相关问题：[如何评价 CVPR2022 的论文接收结果？](https://www.zhihu.com/question/519162597)
\n相关报道：[CVPR 2022 接收结果出炉！录用 2067 篇，接收数量上升24%](https://mp.weixin.qq.com/s/hAWrCpili4CICJzKrQ8Sog)
\n\n\n>update:
\n>2022/3/3 [更新 19 篇](https://bbs.cvmart.net/articles/6149)
\n>2022/3/4 [更新 29 篇](https://bbs.cvmart.net/articles/6151)
\n>2022/3/7 [更新 17 篇](https://bbs.cvmart.net/articles/6158)
\n>2022/3/9 [更新 57 篇](https://bbs.cvmart.net/articles/6164)
\n>2022/3/10 [更新 8 篇](https://bbs.cvmart.net/articles/6171)
\n>2022/3/11 [更新 18 篇](https://bbs.cvmart.net/articles/6173)
\n>2022/3/14 [更新 11 篇](https://bbs.cvmart.net/articles/6175)
\n>2022/3/15 [更新 30 篇](https://bbs.cvmart.net/articles/6178)
\n>2022/3/16 [更新 16 篇](https://bbs.cvmart.net/articles/6181)
\n>2022/3/17 [更新 24 篇](https://bbs.cvmart.net/articles/6185)
\n>2022/3/18 [更新 25 篇](https://bbs.cvmart.net/articles/6192)
\n>2022/3/22 [更新 52 篇](https://bbs.cvmart.net/articles/6204)
\n>2022/3/23 [更新 29 篇](https://bbs.cvmart.net/articles/6206)
\n>2022/3/24 [更新 22 篇](https://bbs.cvmart.net/articles/6209)
\n>2022/3/25 [更新 29 篇](https://bbs.cvmart.net/articles/6213)
\n>2022/3/28 [更新 29 篇](https://bbs.cvmart.net/articles/6222)
\n>2022/3/29 [更新 35 篇](https://bbs.cvmart.net/articles/6223)
\n>2022/3/30 [更新 28 篇](https://bbs.cvmart.net/articles/6231)
\n>2022/4/2 [更新 15 篇](https://bbs.cvmart.net/articles/6240)
\n>2022/4/6 [更新 26 篇](https://cvmart.net/community/detail/6250)
\n>2022/4/7 [更新 26 篇](https://cvmart.net/community/detail/6251)\n\n

\n\n# 目录\n\n[1. CVPR2022 接受论文/代码分方向汇总（更新中）](#1)
\n[2. CVPR2022 Oral（更新中）](#2)
\n[3. CVPR2022 论文解读汇总（更新中）](#3)
\n[4. CVPR2022 极市论文分享](#4)
\n[5. To do list](#5)
\n\n
\n\n \n\n# 1.CVPR2022接受论文/代码分方向整理(持续更新)\n\n\n## 分类目录：\n\n### [1. 检测](#detection)\n* [2D目标检测(2D Object Detection)](#IOD)\n* [视频目标检测(Video Object Detection)](#VOD)\n* [3D目标检测(3D Object Detection)](#3DOD)\n* [人物交互检测(HOI Detection)](#HOI)\n* [伪装目标检测(Camouflaged Object Detection)](#COD)\n* [旋转目标检测(Rotation Object Detection)](#ROD)\n* [显著性目标检测(Saliency Object Detection)](#SOD)\n* [关键点检测(Keypoint Detection)](#KeypointDetection)\n* [车道线检测(Lane Detection)](#LaneDetection)\n* [边缘检测(Edge Detection)](#EdgeDetection)\n* [消失点检测(Vanishing Point Detection)](#VPD)\n* [异常检测(Anomaly Detection)](#AnomalyDetection)\n\n### [2. 分割(Segmentation)](#Segmentation)\n* [图像分割(Image Segmentation)](#ImageSegmentation)\n* [全景分割(Panoptic Segmentation)](#PanopticSegmentation)\n* [语义分割(Semantic Segmentation)](#SemanticSegmentation)\n* [实例分割(Instance Segmentation)](#InstanceSegmentation)\n* [超像素(Superpixel)](#Superpixel)\n* [视频目标分割(Video Object Segmentation)](#VOS)\n* [抠图(Matting)](#Matting)\n* [密集预测(Dense Prediction)](#DensePrediction)\n\n### [3. 图像处理(Image Processing)](#ImageProcessing)\n\n* [超分辨率(Super Resolution)](#SuperResolution)\n* [图像复原/图像增强/图像重建(Image Restoration/Image Reconstruction)](#ImageRestoration)\n* [图像去阴影/去反射(Image Shadow Removal/Image Reflection Removal)](#ISR)\n* [图像去噪/去模糊/去雨去雾(Image Denoising)](#ImageDenoising)\n* [图像编辑/图像修复(Image Edit/Image Inpainting)](#ImageEdit)\n* [图像翻译(Image Translation)](#ImageTranslation)\n* [图像质量评估(Image Quality Assessment)](#IQA)\n* [风格迁移(Style Transfer)](#StyleTransfer)\n\n### [4. 视频处理(Video Processing)](#VideoProcessing)\n* [视频编辑(Video Editing)](#VideoEditing)\n* [视频生成/视频合成(Video Generation/Video Synthesis)](#VideoGeneration)\n* [视频超分(Video Super-Resolution)](#VideoSR)\n\n### [5. 估计(Estimation)](#Estimation)\n* [光流/运动估计(Flow/Motion Estimation)](#Flow/Pose/MotionEstimation)\n* [深度估计(Depth Estimation)](#DepthEstimation)\n* [人体解析/人体姿态估计(Human Parsing/Human Pose Estimation)](#HumanPoseEstimation)\n* [手势估计(Gesture Estimation)](#GestureEstimation)\n\n### [6. 
图像&视频检索/视频理解(Image&Video Retrieval/Video Understanding)](#ImageRetrieval)\n* [行为识别/动作识别/检测/分割(Action/Activity Recognition)](#ActionRecognition)\n* [行人重识别/检测(Re-Identification/Detection)](#Re-Identification)\n* [图像/视频字幕(Image/Video Caption)](#VideoCaption)\n\n### [7. 人脸(Face)](#Face)\n* [人脸识别/检测(Facial Recognition/Detection)](#FacialRecognition)\n* [人脸生成/合成/重建/编辑(Face Generation/Face Synthesis/Face Reconstruction/Face Editing)](#FaceSynthesis)\n* [人脸伪造/反欺骗(Face Forgery/Face Anti-Spoofing)](#FaceAnti-Spoofing)\n\n### [8. 三维视觉(3D Vision)](#3DVision)\n* [点云(Point Cloud)](#3DPC)\n* [三维重建(3D Reconstruction)](#3DReconstruction)\n* [场景重建/视图合成/新视角合成(Novel View Synthesis)](#NeRF)\n\n### [9. 目标跟踪(Object Tracking)](#ObjectTracking)\n\n### [10. 医学影像(Medical Imaging)](#MedicalImaging)\n\n### [11. 文本检测/识别/理解(Text Detection/Recognition/Understanding)](#TDR)\n\n### [12. 遥感图像(Remote Sensing Image)](#RSI)\n\n### [13. GAN/生成式/对抗式(GAN/Generative/Adversarial)](#GAN)\n\n### [14. 图像生成/图像合成(Image Generation/Image Synthesis)](#IGIS)\n\n### [15. 场景图(Scene Graph)](#SG)\n* [场景图生成(Scene Graph Generation)](#SGG)\n* [场景图预测(Scene Graph Prediction)](#SGP)\n* [场景图理解(Scene Graph Understanding)](#SGU)\n\n### [16. 视觉定位/位姿估计(Visual Localization/Pose Estimation)](#VisualLocalization)\n\n### [17. 视觉推理/视觉问答(Visual Reasoning/VQA)](#VisualReasoning)\n\n### [18. 视觉预测(Vision-based Prediction)](#Vision-basedPrediction)\n\n### [19. 神经网络结构设计(Neural Network Structure Design)](#NNS)\n* [CNN](#CNN)\n* [Transformer](#Transformer)\n* [图神经网络(GNN)](#GNN)\n* [神经网络架构搜索(NAS)](#NAS)\n* [MLP](#MLP)\n\n### [20. 神经网络可解释性(Neural Network Interpretability)](#interpretability)\n\n### [21. 数据集(Dataset)](#Dataset)\n\n### [22. 数据处理(Data Processing)](#DataProcessing)\n* [数据增广(Data Augmentation)](#DataAugmentation)\n* [归一化/正则化(Batch Normalization)](#BatchNormalization)\n* [图像聚类(Image Clustering)](#ImageClustering)\n* [图像压缩(Image Compression)](#ImageCompression)\n\n### [23. 
图像特征提取与匹配(Image Feature Extraction and Matching)](#matching)\n\n### [24. 视觉表征学习(Visual Representation Learning)](#VisualRL)\n\n### [25. 模型训练/泛化(Model Training/Generalization)](#ModelTraining)\n* [噪声标签(Noisy Label)](#NoisyLabel)\n* [长尾分布(Long-Tailed Distribution)](#Long-Tailed)\n\n### [26. 模型压缩(Model Compression)](#ModelCompression)\n* [知识蒸馏(Knowledge Distillation)](#KnowledgeDistillation)\n* [剪枝(Pruning)](#Pruning)\n* [量化(Quantization)](#Quantization)\n\n### [27. 模型评估(Model Evaluation)](#ModelEvaluation)\n\n### [28. 图像分类(Image Classification)](#ImageClassification)\n\n### [29. 图像计数(Image Counting)](#CrowdCounting)\n\n### [30. 机器人(Robotic)](#Robotic)\n\n### [31. 半监督学习/弱监督学习/无监督学习/自监督学习(Self-supervised Learning/Semi-supervised Learning)](#self-supervisedlearning)\n\n### [32. 多模态学习(Multi-Modal Learning)](#MMLearning)\n* [视听学习(Audio-visual Learning)](#Audio-VisualLearning)\n* [视觉-语言(Vision-language)](#VLRL)\n\n### [33. 主动学习(Active Learning)](#ActiveLearning)\n\n### [34. 小样本学习/零样本学习(Few-shot/Zero-shot Learning)](#Few-shotLearning)\n\n### [35. 持续学习(Continual Learning/Life-long Learning)](#ContinualLearning)\n\n### [36. 迁移学习/域自适应(Transfer Learning/Domain Adaptation)](#domain)\n\n### [37. 度量学习(Metric Learning)](#MetricLearning)\n\n### [38. 对比学习(Contrastive Learning)](#ContrastiveLearning)\n\n### [39. 增量学习(Incremental Learning)](#IncrementalLearning)\n\n### [40. 强化学习(Reinforcement Learning)](#RL)\n\n### [41. 元学习(Meta Learning)](#MetaLearning)\n\n### [42. 联邦学习(Federated Learning)](#federatedlearning)\n\n\n\n\n\n\n### [其他](#100)\n\n\n\n

\n\n \n\n## 检测\n\n\n\n
\n\n \n\n### 2D目标检测(2D Object Detection)\n\n[21] OW-DETR: Open-world Detection Transformer(开放世界检测transformer)
\n[paper](https://arxiv.org/abs/2112.01513) | [code](https://github.com/akshitac8/OW-DETR)

\n\n[20] Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation(通过弹性响应蒸馏克服增量目标检测中的灾难性遗忘)
\n[paper](https://arxiv.org/abs/2204.02136) | [code](https://github.com/Hi-FT/ERD)

\n\n[19] AdaMixer: A Fast-Converging Query-Based Object Detector(一种快速收敛的基于查询的对象检测器)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.16507) | [code](https://github.com/MCG-NJU/AdaMixer)

\n\n[18] Multi-Granularity Alignment Domain Adaptation for Object Detection(用于目标检测的多粒度对齐域自适应)
\n[paper](https://arxiv.org/abs/2203.16897) | [code](https://github.com/tiankongzhang/MGADA)

\n\n[17] Interactron: Embodied Adaptive Object Detection(具身自适应目标检测)
\n[paper](https://arxiv.org/abs/2202.00660) | [code](https://github.com/allenai/interactron)

\n\n[16] Label, Verify, Correct: A Simple Few Shot Object Detection Method(标注、验证、纠正：一种简单的小样本物体检测方法)
\n[paper](https://arxiv.org/abs/2112.05749) | [project](https://www.robots.ox.ac.uk/~vgg/research/lvc/)

\n\n[15] Sylph: A Hypernetwork Framework for Incremental Few-shot Object Detection(用于增量少样本目标检测的超网络框架)
\n[paper](https://arxiv.org/abs/2203.13903)

\n\n[14] QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection(用于加速高分辨率小目标检测的级联稀疏查询)
\n[paper](https://arxiv.org/abs/2103.09136) | [code](https://github.com/ChenhongyiYang/QueryDet-PyTorch)

\n\n[13] End-to-End Human-Gaze-Target Detection with Transformers(使用 Transformer 进行端到端的人类注视目标检测)
\n[paper](https://arxiv.org/abs/2203.10433)

\n\n[12] Progressive End-to-End Object Detection in Crowded Scenes(拥挤场景中的渐进式端到端对象检测)
\n[paper](https://arxiv.org/abs/2203.07669) | [code](https://github.com/megvii-model/Iter-E2EDET)

\n\n[11] Real-time Object Detection for Streaming Perception(用于流感知的实时对象检测)
\n[paper](https://arxiv.org/abs/2203.12338) | [code](https://github.com/yancie-yjr/StreamYOLO)

\n\n[10] Oriented RepPoints for Aerial Object Detection(面向空中目标检测的 RepPoints)(**小目标检测**)
\n[paper](https://arxiv.org/abs/2105.11111) | [code](https://github.com/LiWentomng/OrientedRepPoints)

\n\n[9] Confidence Propagation Cluster: Unleash Full Potential of Object Detectors(置信度传播聚类：释放物体检测器的全部潜力)
\n[paper](https://arxiv.org/abs/2112.00342)

\n\n[8] Semantic-aligned Fusion Transformer for One-shot Object Detection(用于单样本目标检测的语义对齐融合Transformer)
\n[paper](https://arxiv.org/abs/2203.09093)

\n\n[7] A Dual Weighting Label Assignment Scheme for Object Detection(一种用于目标检测的双重加权标签分配方案)
\n[paper](https://arxiv.org/abs/2203.09730) | [code](https://github.com/strongwolf/DW)

\n\n[6] MUM : Mix Image Tiles and UnMix Feature Tiles for Semi-Supervised Object Detection(混合图像块和 UnMix 特征块用于半监督目标检测)
\n[paper](https://arxiv.org/abs/2111.10958) | [code](https://github.com/JongMokKim/mix-unmix)

\n\n[5] SIGMA: Semantic-complete Graph Matching for Domain Adaptive Object Detection(域自适应对象检测的语义完全图匹配)
\n[paper](https://arxiv.org/abs/2203.06398) | [code](https://github.com/CityU-AIM-Group/SIGMA)

\n\n[4] Accelerating DETR Convergence via Semantic-Aligned Matching(通过语义对齐匹配加速 DETR 收敛)
\n[paper](https://arxiv.org/abs/2203.06883) | [code](https://github.com/ZhangGongjie/SAM-DETR)

\n\n[3] Focal and Global Knowledge Distillation for Detectors(检测器的焦点和全局知识蒸馏)
\nkeywords: Object Detection, Knowledge Distillation
\n[paper](https://arxiv.org/abs/2111.11837) | [code](https://github.com/yzd-v/FGD)

\n\n[2] Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild(未知感知对象检测：从野外视频中学习你不知道的东西)
\n[paper](https://arxiv.org/abs/2203.03800) | [code](https://github.com/deeplearning-wisc/stud)

\n\n[1] Localization Distillation for Dense Object Detection(密集对象检测的定位蒸馏)
\nkeywords: Bounding Box Regression, Localization Quality Estimation, Knowledge Distillation
\n[paper](https://arxiv.org/abs/2102.12252) | [code](https://github.com/HikariTJU/LD)
\n解读：[南开程明明团队和天大提出LD：目标检测的定位蒸馏](https://zhuanlan.zhihu.com/p/474955539)

\n\n
\n\n\n \n\n### 视频目标检测(Video Object Detection)\n\n[1] Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering(通过联合表示学习和在线聚类进行无监督活动分割)
\n[paper](https://arxiv.org/abs/2105.13353) | [video](https://www.youtube.com/watch?v=i4Fh_3nzzUI&t=12s)

\n\n
\n\n \n\n### 3D目标检测(3D object detection)\n\n[21] CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection(用于多模态 3D 对象检测的对比增强transformer)
\n[paper](https://arxiv.org/abs/2204.00325)

\n\n[20] Forecasting from LiDAR via Future Object Detection(通过未来目标检测从 LiDAR 进行预测)
\n[paper](https://arxiv.org/abs/2203.16297) | [code](https://github.com/neeharperi/FutureDet)

\n\n[15] Point2Seq: Detecting 3D Objects as Sequences(将 3D 对象检测为序列)
\n[paper](https://arxiv.org/abs/2203.13394) | [code](https://github.com/ocNflag/point2seq)

\n\n[14] MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection(用于单目 3D 对象检测的深度感知transformer)
\n[paper](https://arxiv.org/abs/2203.13310) | [code](https://github.com/ZrrSkywalker/MonoDETR.git)

\n\n[13] TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers(用于 3D 对象检测的稳健 LiDAR-Camera Fusion 与 Transformer)
\n[paper](https://arxiv.org/abs/2203.11496) | [code](https://github.com/XuyangBai/TransFusion)

\n\n[12] Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds(学习用于 3D LiDAR 点云的高效基于点的检测器)
\n[paper](https://arxiv.org/abs/2203.11139) | [code](https://github.com/yifanzhang713/IA-SSD)

\n\n[11] Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion(迈向具有深度完成的高质量 3D 检测)
\n[paper](https://arxiv.org/abs/2203.09780)

\n\n[10] MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer(使用深度感知 Transformer 的单目 3D 对象检测)
\n[paper](https://arxiv.org/abs/2203.10981) | [code](https://github.com/kuanchihhuang/MonoDTR)

\n\n[9] Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds(从点云进行 3D 对象检测的 Set-to-Set 方法)
\n[paper](https://arxiv.org/abs/2203.10314) | [code](https://github.com/skyhehe123/VoxSeT)

\n\n[8] VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention
\n[paper](https://arxiv.org/abs/2203.09704) | [code](https://github.com/Gorilla-Lab-SCUT/VISTA)

\n\n[7] MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection(单目 3D 目标检测的联合语义和几何成本量)
\n[paper](https://arxiv.org/abs/2203.08563) | [code](https://github.com/lianqing11/MonoJSG)

\n\n[6] DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection(用于多模态 3D 目标检测的激光雷达相机深度融合)
\n[paper](https://arxiv.org/abs/2203.08195) | [code](https://github.com/tensorflow/lingvo/tree/master/lingvo/)

\n\n[5] Point Density-Aware Voxels for LiDAR 3D Object Detection(用于 LiDAR 3D 对象检测的点密度感知体素)
\n[paper](https://arxiv.org/abs/2203.05662) | [code](https://github.com/TRAILab/PDV)

\n\n[4] Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement(带有形状引导标签增强的弱监督 3D 对象检测)
\n[paper](https://arxiv.org/abs/2203.05238) | [code](https://github.com/xuxw98/BackToReality)

\n\n[3] Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes(在 3D 场景中实现稳健的定向边界框检测)
\n[paper](https://arxiv.org/abs/2011.12001) | [code](https://github.com/qq456cvb/CanonicalVoting)

\n\n[2] A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation(在全景分割的指导下，用于基于 LiDAR 的 3D 对象检测的多功能多视图框架)
\nkeywords: 3D Object Detection with Point-based Methods, 3D Object Detection with Grid-based Methods, Cluster-free 3D Panoptic Segmentation, CenterPoint 3D Object Detection
\n[paper](https://arxiv.org/abs/2203.02133)

\n\n[1] Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving(自动驾驶中用于单目 3D 目标检测的伪立体)
\nkeywords: Autonomous Driving, Monocular 3D Object Detection
\n[paper](https://arxiv.org/abs/2203.02112) | [code](https://github.com/revisitq/Pseudo-Stereo-3D)

\n\n
\n\n \n\n### 人物交互检测(HOI Detection)\n\n[3] What to look at and where: Semantic and Spatial Refined Transformer for detecting human-object interactions(看什么和在哪里看：语义和空间精炼transformer，用于检测人与物体的交互)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.00746)

\n\n[2] MSTR: Multi-Scale Transformer for End-to-End Human-Object Interaction Detection(用于端到端人-物交互检测的多尺度 Transformer)
\n[paper](https://arxiv.org/abs/2203.14709)

\n\n[1] Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer(使用新型一元对变换器的人与物体交互的两阶段检测)
\n[paper](https://arxiv.org/abs/2112.01838) | [project](https://fredzzhang.com/unary-pairwise-transformers)

\n\n
\n\n \n\n### 伪装目标检测(Camouflaged Object Detection)\n\n[2] Implicit Motion Handling for Video Camouflaged Object Detection(视频伪装对象检测的隐式运动处理)
\n[paper](https://arxiv.org/abs/2203.07363) | [dataset](https://xueliancheng.github.io/SLT-Net-project)

\n\n[1] Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection(放大和缩小：用于伪装目标检测的混合尺度三元组网络)
\n[paper](https://arxiv.org/abs/2203.02688) | [code](https://github.com/lartpang/ZoomNet)

\n\n
\n\n \n\n### 旋转目标检测(Rotation Object Detection)\n\n
\n\n \n\n### 显著性目标检测(Saliency Object Detection)\n\n[2] Bi-directional Object-context Prioritization Learning for Saliency Ranking(显着性排名的双向对象上下文优先级学习)
\n[paper](https://arxiv.org/abs/2203.09416) | [code](https://github.com/GrassBro/OCOR)

\n\n[1] Democracy Does Matter: Comprehensive Feature Mining for Co-Salient Object Detection
\n[paper](https://arxiv.org/abs/2203.05787)

\n\n\n
\n\n \n\n### 关键点检测(Keypoint Detection)\n\n[1] UKPGAN: A General Self-Supervised Keypoint Detector(一个通用的自监督关键点检测器)
\n[paper](https://arxiv.org/abs/2011.11974) | [code](https://github.com/qq456cvb/UKPGAN)

\n\n
\n\n \n\n### 车道线检测(Lane Detection)\n\n[2] CLRNet: Cross Layer Refinement Network for Lane Detection(用于车道检测的跨层细化网络)
\n[paper](https://arxiv.org/abs/2203.10350)

\n\n[1] Rethinking Efficient Lane Detection via Curve Modeling(通过曲线建模重新思考高效车道检测)
\nkeywords: Segmentation-based Lane Detection, Point Detection-based Lane Detection, Curve-based Lane Detection, autonomous driving
\n[paper](https://arxiv.org/abs/2203.02431) | [code](https://github.com/voldemortX/pytorch-auto-drive)

\n\n
\n\n \n\n### 边缘检测(Edge Detection)\n\n[1] EDTER: Edge Detection with Transformer(使用transformer的边缘检测)
\n[paper](https://arxiv.org/abs/2203.08566) | [code](https://github.com/MengyangPu/EDTER)

\n\n
\n\n \n\n### 消失点检测(Vanishing Point Detection)\n\n[1] Deep vanishing point detection: Geometric priors make dataset variations vanish(深度**消失点检测**：几何先验使数据集变化消失)
\n[paper](https://arxiv.org/abs/2203.08586) | [code](https://github.com/yanconglin/VanishingPoint_HoughTransform_GaussianSphere)

\n\n
\n\n \n\n### 异常检测(Anomaly Detection)\n\n[5] Catching Both Gray and Black Swans: Open-set Supervised Anomaly Detection(捕捉灰天鹅和黑天鹅：开放集监督异常检测)
\n[paper](https://arxiv.org/abs/2203.14506) | [code](https://github.com/choubo/DRA)

\n\n[4] UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection(监督开放集视频异常检测的新基准)
\n[paper](https://arxiv.org/abs/2111.08644) | [code](https://github.com/lilygeorgescu/UBnormal)

\n\n[3] ViM: Out-Of-Distribution with Virtual-logit Matching(具有虚拟 logit 匹配的分布外)(**OOD detection**)
\n[paper](https://arxiv.org/abs/2203.10807) | [code](https://github.com/haoqiwang/vim)

\n\n[2] Generative Cooperative Learning for Unsupervised Video Anomaly Detection(用于无监督视频异常检测的生成式协作学习)
\n[paper](https://arxiv.org/abs/2203.03962)

\n\n[1] Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection(用于异常检测的自监督预测卷积注意力块)
\n[paper](https://arxiv.org/abs/2111.09099) | [code](https://github.com/ristea/sspcab)

\n\n\n
\n\n \n\n\n## 分割(Segmentation)\n\n
\n\n \n\n### 图像分割(Image Segmentation)\n\n[6] Learning Part Segmentation through Unsupervised Domain Adaptation from Synthetic Vehicles(通过合成车辆的无监督域适应学习零件分割)(**Oral**)
\n[paper](https://arxiv.org/abs/2103.14098) | [dataset](https://qliu24.github.io/udapart)

\n\n[5] Progressive Minimal Path Method with Embedded CNN(具有嵌入式 CNN 的渐进最小路径方法)
\n[paper](https://arxiv.org/abs/2204.00944)

\n\n[4] Revisiting Near/Remote Sensing with Geospatial Attention(用地理空间注意力重新审视近/遥感)
\n[paper](https://arxiv.org/abs/2204.01807)

\n\n[3] Learning What Not to Segment: A New Perspective on Few-Shot Segmentation(学习不分割的内容：关于小样本分割的新视角)
\n[paper](https://arxiv.org/abs/2203.07615) | [code](http://github.com/chunbolang/BAM)

\n\n[2] CRIS: CLIP-Driven Referring Image Segmentation(CLIP 驱动的参考图像分割)
\n[paper](https://arxiv.org/abs/2111.15174)

\n\n[1] Hyperbolic Image Segmentation(双曲线图像分割)
\n[paper](https://arxiv.org/abs/2203.05898)

\n\n
\n\n \n\n### 全景分割(Panoptic Segmentation)\n\n[2] Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers(使用 Transformers 深入研究全景分割)
\n[paper](https://arxiv.org/abs/2109.03814) | [code](https://github.com/zhiqi-li/Panoptic-SegFormer)

\n\n[1] Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation(弯曲现实：适应全景语义分割的失真感知Transformer)
\nkeywords: Semantic- and panoramic segmentation, Unsupervised domain adaptation, Transformer
\n[paper](https://arxiv.org/abs/2203.01452) | [code](https://github.com/jamycheung/Trans4PASS)

\n\n
\n\n \n\n### 语义分割(Semantic Segmentation)\n\n[22] Semantic-Aware Domain Generalized Segmentation(语义感知领域广义分割)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.00822) | [code](https://github.com/leolyj/SAN-SAW)

\n\n[21] FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation(学习雾景分割的雾不变特征)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.01587) | [project](http://cvlab.postech.ac.kr/research/FIFO/)

\n\n[20] WildNet: Learning Domain Generalized Semantic Segmentation from the Wild(从野外学习领域广义语义分割)
\n[paper](https://arxiv.org/abs/2204.01446) | [code](https://github.com/suhyeonlee/WildNet)

\n\n[19] Rethinking Semantic Segmentation: A Prototype View(重新思考语义分割：原型视图)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.15102) | [code](https://github.com/tfzhou/ProtoSeg)

\n\n[18] DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation(改进域自适应语义分割的网络架构和训练策略)
\n[paper](https://arxiv.org/abs/2111.14887) | [code](https://github.com/lhoyer/DAFormer)

\n\n[17] Towards Fewer Annotations: Active Learning via Region Impurity and Prediction Uncertainty for Domain Adaptive Semantic Segmentation(朝向更少的注释：通过区域不纯度和预测不确定性进行域自适应语义分割的主动学习)
\n[paper](https://arxiv.org/abs/2111.12940) | [code](https://github.com/BIT-DA/RIPU)

\n\n[16] Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation(半监督语义分割的扰动和严格均值教师)
\n[paper](https://arxiv.org/abs/2111.12903)

\n\n[15] Class-Balanced Pixel-Level Self-Labeling for Domain Adaptive Semantic Segmentation(用于域自适应语义分割的类平衡像素级自标记)
\n[paper](https://arxiv.org/abs/2203.09744) | [code](https://github.com/lslrh/CPSL)

\n\n[14] Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic Segmentation(弱监督语义分割的区域语义对比和聚合)
\n[paper](https://arxiv.org/abs/2203.09653) | [code](https://github.com/maeve07/RCA.git)

\n\n[13] Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation(走向稀疏注释的语义分割)
\n[paper](https://arxiv.org/abs/2203.10739) | [code](https://github.com/megviiresearch/TEL)

\n\n[12] Scribble-Supervised LiDAR Semantic Segmentation
\n[paper](https://arxiv.org/abs/2203.08537) | [code](http://github.com/ouenal/scribblekitti)

\n\n[11] ADAS: A Direct Adaptation Strategy for Multi-Target Domain Adaptive Semantic Segmentation(多目标域自适应语义分割的直接适应策略)
\n[paper](https://arxiv.org/abs/2203.06811)

\n\n[10] Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast(通过像素到原型对比的弱监督语义分割)
\n[paper](https://arxiv.org/abs/2110.07110)

\n\n[9] Representation Compensation Networks for Continual Semantic Segmentation(连续语义分割的表示补偿网络)
\n[paper](https://arxiv.org/abs/2203.05402) | [code](https://github.com/zhangchbin/RCIL)

\n\n[8] Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels(使用不可靠伪标签的半监督语义分割)
\n[paper](https://arxiv.org/abs/2203.03884) | [code](https://github.com/Haochen-Wang409/U2PL/) | [project](https://haochen-wang409.github.io/U2PL/)

\n\n[7] Weakly Supervised Semantic Segmentation using Out-of-Distribution Data(使用分布外数据的弱监督语义分割)
\n[paper](https://arxiv.org/abs/2203.03860) | [code](https://github.com/naver-ai/w-ood)

\n\n[6] Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation(弱监督语义分割的自监督图像特定原型探索)
\n[paper](https://arxiv.org/abs/2203.02909) | [code](https://github.com/chenqi1126/SIPE)

\n\n[5] Multi-class Token Transformer for Weakly Supervised Semantic Segmentation(用于弱监督语义分割的多类token Transformer)
\n[paper](https://arxiv.org/abs/2203.02891) | [code](https://github.com/xulianuwa/MCTformer)

\n\n[4] Cross Language Image Matching for Weakly Supervised Semantic Segmentation(用于弱监督语义分割的跨语言图像匹配)
\n[paper](https://arxiv.org/abs/2203.02668)

\n\n[3] Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers(从注意力中学习亲和力：使用 Transformers 的端到端弱监督语义分割)
\n[paper](https://arxiv.org/abs/2203.02664) | [code](https://github.com/rulixiang/afa)

\n\n[2] ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation(让自我训练更好地用于半监督语义分割)
\nkeywords: Semi-supervised learning, Semantic segmentation, Uncertainty estimation
\n[paper](https://arxiv.org/abs/2106.05095) | [code](https://github.com/LiheYoung/ST-PlusPlus)

\n\n[1] Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation(弱监督语义分割的类重新激活图)
\n[paper](https://arxiv.org/pdf/2203.00962.pdf) | [code](https://github.com/zhaozhengChen/ReCAM)

\n\n
\n\n \n\n### 实例分割(Instance Segmentation)\n\n[12] Sparse Object-level Supervision for Instance Segmentation with Pixel Embeddings(具有像素嵌入的实例分割的稀疏对象级监督)
\n[paper](https://arxiv.org/abs/2103.14572) | [code](https://github.com/kreshuklab/spoco)

\n\n[11] Relieving Long-tailed Instance Segmentation via Pairwise Class Balance(通过 Pairwise Class Balance 减轻长尾实例分割)
\n[paper](https://arxiv.org/abs/2201.02784) | [code](https://github.com/megvii-research/PCB)

\n\n[10] Beyond Semantic to Instance Segmentation: Weakly-Supervised Instance Segmentation via Semantic Knowledge Transfer and Self-Refinement(超越语义到实例分割：通过语义知识转移和自我完善的弱监督实例分割)
\n[paper](https://arxiv.org/abs/2109.09477) | [code](https://github.com/clovaai/BESTIE)

\n\n[9] Noisy Boundaries: Lemon or Lemonade for Semi-supervised Instance Segmentation?(嘈杂的边界：半监督实例分割的柠檬还是柠檬水？)
\n[paper](https://arxiv.org/abs/2203.13427)

\n\n[8] SharpContour: A Contour-based Boundary Refinement Approach for Efficient and Accurate Instance Segmentation(一种用于高效准确实例分割的基于轮廓的边界细化方法)
\n[paper](https://arxiv.org/abs/2203.13312) | [project](https://xyzhang17.github.io/SharpContour/)

\n\n[7] Sparse Instance Activation for Real-Time Instance Segmentation(实时实例分割的稀疏实例激活)
\n[paper](https://arxiv.org/abs/2203.12827) | [code](https://github.com/hustvl/SparseInst)

\n\n[6] Mask Transfiner for High-Quality Instance Segmentation(用于高质量实例分割的 Mask Transfiner)
\n[paper](https://arxiv.org/abs/2111.13673) | [code](https://github.com/SysCV/transfiner)

\n\n[5] ContrastMask: Contrastive Learning to Segment Every Thing(对比学习分割每件事)
\n[paper](https://arxiv.org/abs/2203.09775)

\n\n[4] Discovering Objects that Can Move(发现可以移动的物体)
\n[paper](https://arxiv.org/abs/2203.10159) | [code](https://github.com/zpbao/Discovery_Obj_Move/)

\n\n[3] E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation(一种基于端到端轮廓的高质量高速实例分割方法)
\n[paper](https://arxiv.org/abs/2203.04074) | [code](https://github.com/zhang-tao-whu/e2ec)

\n\n[2] Efficient Video Instance Segmentation via Tracklet Query and Proposal(通过 Tracklet Query 和 Proposal 进行高效的视频实例分割)
\n[paper](https://arxiv.org/abs/2203.01853)

\n\n[1] SoftGroup for 3D Instance Segmentation on Point Clouds(用于点云上的 3D 实例分割)
\nkeywords: 3D Vision, Point Clouds, Instance Segmentation
\n[paper](https://arxiv.org/abs/2203.01509) | [code](https://github.com/thangvubk/SoftGroup.git)

\n\n
\n\n \n\n### 超像素(Superpixel)\n\n
\n\n \n\n### 视频目标分割(Video Object Segmentation)\n\n[1] Language as Queries for Referring Video Object Segmentation(语言作为引用视频对象分割的查询)
\n[paper](https://arxiv.org/abs/2201.00487) | [code](https://github.com/wjn922/ReferFormer)

\n\n
\n\n \n\n### 抠图(Matting)\n\n
\n\n \n\n### 密集预测(Dense Prediction)\n\n[1] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting(具有上下文感知提示的语言引导密集预测)
\n[paper](https://arxiv.org/abs/2112.01518) | [code](https://github.com/raoyongming/DenseCLIP)

\n\n
\n\n \n\n## 视频处理(Video Processing)\n\n[5] Bringing Old Films Back to Life(让老电影焕然一新)
\n[paper](https://arxiv.org/abs/2203.17276) | [code](https://github.com/raywzy/Bringing-Old-Films-Back-to-Life)

\n\n[4] Time Lens++: Event-based Frame Interpolation with Parametric Non-linear Flow and Multi-scale Fusion(具有参数非线性流和多尺度融合的基于事件的帧插值)
\n[paper](https://arxiv.org/abs/2203.17191) | [project](https://uzh-rpg.github.io/timelens-pp/) | [video](https://youtu.be/AsRKQRWHbVs) | [dataset](https://github.com/uzh-rpg/timelens-pp/)

\n\n[3] Long-term Video Frame Interpolation via Feature Propagation(通过特征传播的长期视频帧插值)
\n[paper](https://arxiv.org/abs/2203.15427)

\n\n[2] Unifying Motion Deblurring and Frame Interpolation with Events(将运动去模糊和帧插值与事件统一起来)
\n[paper](https://arxiv.org/abs/2203.12178)

\n\n[1] Neural Compression-Based Feature Learning for Video Restoration(用于视频复原的基于神经压缩的特征学习)
\n[paper](https://arxiv.org/abs/2203.09208)

\n\n
\n\n \n\n### 视频编辑(Video Editing)\n\n[1] M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers(M3L：通过多模式多级transformer进行基于语言的视频编辑)
\n[paper](https://arxiv.org/abs/2104.01122)

\n\n
\n\n \n\n### 视频生成/视频合成(Video Generation/Video Synthesis)\n\n[2] Depth-Aware Generative Adversarial Network for Talking Head Video Generation(用于说话头视频生成的深度感知生成对抗网络)
\n[paper](https://arxiv.org/abs/2203.06605) | [code](https://github.com/harlanhong/CVPR2022-DaGAN)

\n\n[1] Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning(告诉我什么并告诉我如何：通过多模式调节进行视频合成)
\n[paper](https://arxiv.org/abs/2203.02573) | [code](https://github.com/snap-research/MMVID)

\n\n
\n\n \n\n### 视频超分(Video Super-Resolution)\n\n[1] Reference-based Video Super-Resolution Using Multi-Camera Video Triplets(使用多摄像机视频三元组的基于参考的视频超分辨率)
\n[paper](https://arxiv.org/abs/2203.14537) | [code](https://github.com/codeslake/RefVSR)

\n\n
\n\n \n\n## 估计(Estimation)\n\n\n
\n\n \n\n### 光流/运动估计(Optical Flow/Motion Estimation)\n\n[2] Global Matching with Overlapping Attention for Optical Flow Estimation(具有重叠注意力的全局匹配光流估计)
\n[paper](https://arxiv.org/abs/2203.11335) | [code](https://github.com/xiaofeng94/GMFlowNet)

\n\n[1] CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation(用于联合光流和场景流估计的双向相机-LiDAR 融合)
\n[paper](https://arxiv.org/abs/2111.10502)

\n\n
\n\n \n\n### 深度估计(Depth Estimation)\n\n[17] Degradation-agnostic Correspondence from Resolution-asymmetric Stereo(来自分辨率非对称立体声的与退化无关的对应)
\n[paper](https://arxiv.org/abs/2204.01429)

\n\n[16] P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior(具有分段平面先验的单目深度估计)
\n[paper](https://arxiv.org/abs/2204.02091) | [code](https://github.com/SysCV/P3Depth)

\n\n[15] Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry(通过融合单视图深度概率与多视图几何进行多视图深度估计)(**Oral**)
\n[paper](https://arxiv.org/abs/2112.08177) | [code](https://github.com/baegwangbin/MaGNet)

\n\n[14] Learning Structured Gaussians to Approximate Deep Ensembles(学习结构化高斯函数以逼近深度集成)
\n[paper](https://arxiv.org/abs/2203.15485)

\n\n[13] LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network(具有几何感知变压器网络的室内全景房间布局估计)(layout estimation)
\n[paper](https://arxiv.org/abs/2203.01824) | [code](https://github.com/zhigangjiang/LGT-Net)

\n\n[12] Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation(基于自适应相关的级联循环网络的实用立体匹配)
\n[paper](https://arxiv.org/abs/2203.11483) | [code](https://github.com/megvii-research/CREStereo)

\n\n[11] Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light(结合双目立体和单目结构光的深度估计)
\n[paper](https://arxiv.org/abs/2203.10493) | [code](https://github.com/YuhuaXu/MonoStereoFusion)

\n\n[10] RGB-Depth Fusion GAN for Indoor Depth Completion(用于室内深度完成的 RGB 深度融合 GAN)
\n[paper](https://arxiv.org/abs/2203.10856)

\n\n[9] Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective(从特征一致性的角度重新审视域广义立体匹配网络)
\n[paper](https://arxiv.org/abs/2203.10887)

\n\n[8] Deep Depth from Focus with Differential Focus Volume(具有不同焦点体积的焦点深度)
\n[paper](https://arxiv.org/abs/2112.01712)

\n\n[7] ChiTransformer: Towards Reliable Stereo from Cues(从线索走向可靠的立体声)
\n[paper](https://arxiv.org/abs/2203.04554)

\n\n[6] Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation and Focal Loss(重新思考多视图立体的深度估计：统一表示和焦点损失)
\n[paper](https://arxiv.org/abs/2201.01501) | [code](https://github.com/prstrive/UniMVSNet)

\n\n[5] ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks(立体匹配网络中自动避免捷径和域泛化的信息论方法)
\nkeywords: Learning-based Stereo Matching Networks, Single Domain Generalization, Shortcut Learning
\n[paper](https://arxiv.org/pdf/2201.02263.pdf)

\n\n[4] Attention Concatenation Volume for Accurate and Efficient Stereo Matching(用于精确和高效立体匹配的注意力连接体积)
\nkeywords: Stereo Matching, cost volume construction, cost aggregation
\n[paper](https://arxiv.org/pdf/2203.02146.pdf) | [code](https://github.com/gangweiX/ACVNet)

\n\n[3] Occlusion-Aware Cost Constructor for Light Field Depth Estimation(光场深度估计的遮挡感知成本构造函数)
\n[paper](https://arxiv.org/pdf/2203.01576.pdf) | [code](https://github.com/YingqianWang/OACC-Net)

\n\n[2] NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation(用于单目深度估计的神经窗口全连接 CRF)
\nkeywords: Neural CRFs for Monocular Depth
\n[paper](https://arxiv.org/pdf/2203.01502.pdf)

\n\n[1] OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion(通过几何感知融合进行 360 度单目深度估计)
\nkeywords: monocular depth estimation(单目深度估计), transformer
\n[paper](https://arxiv.org/abs/2203.00838)

\n\n\n
\n\n \n\n### 人体解析/人体姿态估计(Human Parsing/Human Pose Estimation)\n\n[11] PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision(自我监督下共同进化的 3D 人体姿势估计、模仿和幻觉)
\n[paper](https://arxiv.org/abs/2203.15625) | [code](https://github.com/Garfield-kh/PoseTriplet)

\n\n[10] Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes(从野外拥挤的场景中学习估计稳健的 3D 人体网格)
\n[paper](https://arxiv.org/abs/2104.07300) | [code](https://github.com/hongsukchoi/3DCrowdNet_RELEASE)

\n\n[9] Ray3D: ray-based 3D human pose estimation for monocular absolute 3D localization(用于单目绝对 3D 定位的基于射线的 3D 人体姿态估计)
\n[paper](https://arxiv.org/abs/2203.11471) | [code](https://github.com/YxZhxn/Ray3D)

\n\n[8] Capturing Humans in Motion: Temporal-Attentive 3D Human Pose and Shape Estimation from Monocular Video(捕捉运动中的人类：来自单目视频的时间注意 3D 人体姿势和形状估计)
\n[paper](https://arxiv.org/abs/2203.08534) | [project](https://mps-net.github.io/MPS-Net/)

\n\n[7] Physical Inertial Poser (PIP): Physics-aware Real-time Human Motion Tracking from Sparse Inertial Sensors(来自稀疏惯性传感器的物理感知实时人体运动跟踪)
\n[paper](https://arxiv.org/abs/2203.08528) | [project](https://xinyu-yi.github.io/PIP/)

\n\n[6] Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation(用于多人 3D 姿势估计的分布感知单阶段模型)
\n[paper](https://arxiv.org/abs/2203.07697)

\n\n[5] MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation(用于 3D 人体姿势估计的多假设transformer)
\n[paper](https://arxiv.org/abs/2111.12707) | [code](https://github.com/Vegetebird/MHFormer)

\n\n[4] CDGNet: Class Distribution Guided Network for Human Parsing(用于人类解析的类分布引导网络)
\n[paper](https://arxiv.org/abs/2111.14173)

\n\n[3] Forecasting Characteristic 3D Poses of Human Actions(预测人类行为的特征 3D 姿势)
\n[paper](https://arxiv.org/abs/2011.15079) | [project](https://charposes.christian-diller.de/) | [video](https://youtu.be/kVhn8OWMgME)

\n\n[2] Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation(学习用于多人姿势估计的局部-全局上下文适应)
\nkeywords: Top-Down Pose Estimation(从上至下姿态估计), Limb-based Grouping, Direct Regression
\n[paper](https://arxiv.org/pdf/2109.03622.pdf)

\n\n[1] MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video(用于视频中 3D 人体姿势估计的 Seq2seq 混合时空编码器)
\nkeywords: 3D Human Pose Estimation, Transformer
\n[paper](https://arxiv.org/pdf/2203.00859.pdf)

\n\n
\n\n \n\n### 手势估计(Gesture Estimation)\n\n[1] ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis(通过在线探索和合成提升关节式 3D 手对象姿势估计)
\n[paper](https://arxiv.org/abs/2109.05488) | [code](https://github.com/lixiny/ArtiBoost)

\n\n\n
\n\n \n\n\n## 图像处理(Image Processing)\n\n
\n\n \n\n### 超分辨率(Super Resolution)\n\n[10] High-Resolution Image Harmonization via Collaborative Dual Transformations(通过协作双变换实现高分辨率图像协调)
\n[paper](https://arxiv.org/abs/2109.06671) | [code](https://github.com/bcmi/CDTNet-High-Resolution-Image-Harmonization)

\n\n[9] Deep Constrained Least Squares for Blind Image Super-Resolution(用于盲图像超分辨率的深度约束最小二乘)
\n[paper](https://arxiv.org/abs/2202.07508)

\n\n[8] Local Texture Estimator for Implicit Representation Function(隐式表示函数的局部纹理估计器)
\n[paper](https://arxiv.org/abs/2111.08918)

\n\n[7] A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution(一种用于空间变形鲁棒场景文本图像超分辨率的文本注意网络)
\n[paper](https://arxiv.org/abs/2203.09388) | [code](https://github.com/mjq11302010044/TATT)

\n\n[6] Details or Artifacts: A Locally Discriminative Learning Approach to Realistic Image Super-Resolution(一种真实图像超分辨率的局部判别学习方法)
\n[paper](https://arxiv.org/abs/2203.09195) | [code](https://github.com/csjliang/LDL)

\n\n[5] Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel(对噪声和核进行精细退化建模的盲图像超分辨率)
\n[paper](https://arxiv.org/abs/2107.00986) | [code](https://github.com/zsyOAOA/BSRDM)

\n\n[4] Reflash Dropout in Image Super-Resolution(图像超分辨率中的闪退dropout)
\n[paper](https://arxiv.org/abs/2112.12089)

\n\n[3] Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence(迈向双向任意图像缩放：联合优化和循环幂等)
\n[paper](https://arxiv.org/abs/2203.00911)

\n\n[2] HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening(用于全色锐化的纹理和光谱特征融合Transformer)
\n[paper](https://arxiv.org/abs/2203.02503) | [code](https://github.com/wgcban/HyperTransformer)

\n\n[1] HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging(光谱压缩成像的高分辨率双域学习)
\nkeywords: HSI Reconstruction, Self-Attention Mechanism, Image Frequency Spectrum Analysis
\n[paper](https://arxiv.org/pdf/2203.02149.pdf)

\n\n
\n\n \n\n### 图像复原/图像增强/图像重建(Image Restoration/Image Reconstruction)\n\n[7] HyperInverter: Improving StyleGAN Inversion via Hypernetwork(通过超网络改进 StyleGAN 反转)
\n[paper](https://arxiv.org/abs/2112.00719) | [project](https://di-mi-ta.github.io/HyperInverter/)

\n\n[6] Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation(用于高效 3DCG 背景创建的多样化合理 360 度图像外绘)
\n[paper](https://arxiv.org/abs/2203.14668) | [project](https://akmtn.github.io/omni-dreamer/)

\n\n[5] Exploring and Evaluating Image Restoration Potential in Dynamic Scenes(探索和评估动态场景中的图像复原潜力)
\n[paper](https://arxiv.org/abs/2203.11754)

\n\n[4] Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction(通过随机收缩加速逆问题的条件扩散模型)
\n[paper](https://arxiv.org/abs/2112.05146)

\n\n[3] Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction(用于高效高光谱图像重建的掩模引导光谱变换器)
\n[paper](https://arxiv.org/abs/2111.07910) | [code](https://github.com/caiyuanhao1998/MST/)

\n\n[2] Restormer: Efficient Transformer for High-Resolution Image Restoration(用于高分辨率图像复原的高效transformer)
\n[paper](https://arxiv.org/abs/2111.09881) | [code](https://github.com/swz30/Restormer)

\n\n[1] Event-based Video Reconstruction via Potential-assisted Spiking Neural Network(通过电位辅助尖峰神经网络进行基于事件的视频重建)
\n[paper](https://arxiv.org/pdf/2201.10943.pdf)

\n\n
\n\n\n \n\n### 图像去阴影/去反射(Image Shadow Removal/Image Reflection Removal)\n\n
\n\n\n\n \n\n### 图像去噪/去模糊/去雨去雾(Image Denoising)\n\n[6] CVF-SID: Cyclic multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise from Image(通过从图像中分离噪声的自监督图像去噪的循环多变量函数)
\n[paper](https://arxiv.org/abs/2203.13009) | [code](https://github.com/Reyhanehne/CVF-SID_PyTorch)

\n\n[5] Unpaired Deep Image Deraining Using Dual Contrastive Learning(使用双重对比学习的非配对深度图像去雨)
\n[paper](https://arxiv.org/abs/2109.02973) | [project](https://cxtalk.github.io/projects/DCD-GAN.html)

\n\n[4] AP-BSN: Self-Supervised Denoising for Real-World Images via Asymmetric PD and Blind-Spot Network(通过非对称 PD 和盲点网络对真实世界图像进行自监督去噪)
\n[paper](https://arxiv.org/abs/2203.11799) | [code](https://github.com/wooseoklee4/AP-BSN)

\n\n[3] IDR: Self-Supervised Image Denoising via Iterative Data Refinement(通过迭代数据细化的自监督图像去噪)
\n[paper](https://arxiv.org/abs/2111.14358) | [code](https://github.com/zhangyi-3/IDR)

\n\n[2] Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots(具有可见盲点的自监督图像去噪)
\n[paper](https://arxiv.org/abs/2203.06967) | [code](https://github.com/demonsjin/Blind2Unblind)

\n\n[1] E-CIR: Event-Enhanced Continuous Intensity Recovery(事件增强的连续强度恢复)
\nkeywords: Event-Enhanced Deblurring, Video Representation
\n[paper](https://arxiv.org/abs/2203.01935) | [code](https://github.com/chensong1995/E-CIR)

\n\n
\n\n \n\n### 图像编辑/图像修复(Image Edit/Inpainting)\n\n[6] HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing(用于真实图像编辑的超网络 StyleGAN 反演)
\n[paper](https://arxiv.org/abs/2111.15666) | [project](http://yuval-alaluf.github.io/hyperstyle/)

\n\n[5] High-Fidelity GAN Inversion for Image Attribute Editing(用于图像属性编辑的高保真 GAN 反演)
\n[paper](https://arxiv.org/abs/2109.06590) | [code](https://github.com/Tengfei-Wang/HFGI) | [project](https://tengfei-wang.github.io/HFGI/)

\n\n[4] Style Transformer for Image Inversion and Editing(用于图像反转和编辑的样式transformer)
\n[paper](https://arxiv.org/abs/2203.07932) | [code](https://github.com/sapphire497/style-transformer)

\n\n[3] MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting(用于高保真图像修复的多级交互式 Siamese 过滤)
\n[paper](https://arxiv.org/abs/2203.06304) | [code](https://github.com/tsingqguo/misf)

\n\n[2] HairCLIP: Design Your Hair by Text and Reference Image(通过文本和参考图像设计你的头发)
\nkeywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks
\n[paper](https://arxiv.org/abs/2112.05142) | [code](https://github.com/wty-ustc/HairCLIP)

\n\n[1] Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding(增量transformer结构增强图像修复与掩蔽位置编码)
\nkeywords: Image Inpainting, Transformer, Image Generation
\n[paper](https://arxiv.org/abs/2203.00867) | [code](https://github.com/DQiaole/ZITS_inpainting)

\n\n
\n\n \n\n### 图像翻译(Image Translation)\n\n[6] Marginal Contrastive Correspondence for Guided Image Generation(引导图像生成的边际对比对应)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.00442)

\n\n[5] Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation(未配对图像到图像翻译的最大空间扰动一致性)
\n[paper](https://arxiv.org/abs/2203.12707) | [code](https://github.com/batmanlab/MSPC)

\n\n[4] Globetrotter: Connecting Languages by Connecting Images(通过连接图像连接语言)
\n[paper](https://arxiv.org/abs/2012.04631)

\n\n[3] QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation(图像翻译中对比学习的查询选择注意)
\n[paper](https://arxiv.org/abs/2203.08483) | [code](https://github.com/sapphire497/query-selected-attention)

\n\n[2] FlexIT: Towards Flexible Semantic Image Translation(迈向灵活的语义图像翻译)
\n[paper](https://arxiv.org/abs/2203.04705)

\n\n[1] Exploring Patch-wise Semantic Relation for Contrastive Learning in Image-to-Image Translation Tasks(探索图像到图像翻译任务中对比学习的补丁语义关系)
\nkeywords: image translation, knowledge transfer, contrastive learning
\n[paper](https://arxiv.org/pdf/2203.01532.pdf)

\n\n
\n\n \n\n### 图像质量评估(Image Quality Assessment)\n\n
\n\n \n\n### 风格迁移(Style Transfer)\n\n[5] Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer(基于示例的高分辨率肖像风格转移)
\n[paper](https://arxiv.org/abs/2203.13248) | [code](https://github.com/williamyang1991/DualStyleGAN) | [project](https://www.mmlab-ntu.com/project/dualstylegan/)

\n\n[4] Industrial Style Transfer with Large-scale Geometric Warping and Content Preservation(具有大规模几何变形和内容保留的工业风格迁移)
\n[paper](https://arxiv.org/abs/2203.12835) | [project](https://jcyang98.github.io/InST/home.html) | [code](https://github.com/jcyang98/InST)

\n\n[3] Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization(任意风格迁移和域泛化的精确特征分布匹配)
\n[paper](https://arxiv.org/abs/2203.07740) | [code](https://github.com/YBZh/EFDM)

\n\n[2] Style-ERD: Responsive and Coherent Online Motion Style Transfer(响应式和连贯的在线运动风格迁移)
\n[paper](https://arxiv.org/abs/2203.02574)

\n\n[1] CLIPstyler: Image Style Transfer with a Single Text Condition(具有单一文本条件的图像风格转移)
\nkeywords: Style Transfer, Text-guided synthesis, Language-Image Pre-Training (CLIP)
\n[paper](https://arxiv.org/abs/2112.00374)

\n\n\n
\n\n \n\n## 人脸(Face)\n\n[6] ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations(具有隐式神经表示的非线性 3D 可变形人脸模型)
\n[paper](https://arxiv.org/abs/2203.14510)

\n\n[5] Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?(跨模态感知者：可以从声音中收集面部几何形状吗？)
\n[paper](https://arxiv.org/abs/2203.09824) | [project](https://choyingw.github.io/works/Voice2Mesh/index.html)

\n\n[4] Portrait Eyeglasses and Shadow Removal by Leveraging 3D Synthetic Data(利用 3D 合成数据去除人像眼镜和阴影)
\n[paper](https://arxiv.org/abs/2203.10474) | [code](https://github.com/StoryMY/take-off-eyeglasses)

\n\n[3] HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network(分层解析胶囊网络的无监督人脸部分发现)
\n[paper](https://arxiv.org/abs/2203.10699)

\n\n[2] FaceFormer: Speech-Driven 3D Facial Animation with Transformers(FaceFormer：带有transformer的语音驱动的 3D 面部动画)
\n[paper](https://arxiv.org/abs/2112.05329) | [code](https://evelynfan.github.io/audio2face/)

\n\n[1] Sparse Local Patch Transformer for Robust Face Alignment and Landmarks Inherent Relation Learning(用于鲁棒人脸对齐和地标固有关系学习的稀疏局部补丁transformer)
\n[paper](https://arxiv.org/abs/2203.06541) | [code](https://github.com/Jiahao-UTS/SLPT-master)

\n\n
\n\n \n\n### 人脸识别/检测(Facial Recognition/Detection)\n\n\n[4] DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover's Distance Improves Out-Of-Distribution Face Identification(使用 Patch-wise Earth Mover 的距离重新排序改进了分布外人脸识别)
\n[paper](https://arxiv.org/abs/2112.04016) | [code](https://github.com/anguyen8/deepface-emd)

\n\n[3] Towards Semi-Supervised Deep Facial Expression Recognition with An Adaptive Confidence Margin(具有自适应置信度的半监督深度面部表情识别)
\n[paper](https://arxiv.org/abs/2203.12341) | [code](https://github.com/hangyu94/Ada-CM)

\n\n[2] Privacy-preserving Online AutoML for Domain-Specific Face Detection(用于特定领域人脸检测的隐私保护在线 AutoML)
\n[paper](https://arxiv.org/abs/2203.08399)

\n\n[1] An Efficient Training Approach for Very Large Scale Face Recognition(一种有效的超大规模人脸识别训练方法)
\n[paper](https://arxiv.org/pdf/2105.10375.pdf) | [code](https://github.com/tiandunx/FFC)

\n\n
\n\n \n\n### 人脸生成/合成/重建/编辑(Face Generation/Face Synthesis/Face Reconstruction/Face Editing)\n\n[4] TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing(基于 Transformer 的双空间 GAN 用于高度可控的面部编辑)
\n[paper](https://arxiv.org/abs/2203.17266) | [code](https://github.com/BillyXYB/TransEditor) | [project](https://billyxyb.github.io/TransEditor/)

\n\n[3] FENeRF: Face Editing in Neural Radiance Fields(神经辐射场中的人脸编辑)
\n[paper](https://arxiv.org/abs/2111.15490) | [project](https://mrtornado24.github.io/FENeRF/)

\n\n[2] GCFSR: a Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors(一种没有面部和 GAN 先验的生成可控人脸超分辨率方法)
\n[paper](https://arxiv.org/abs/2203.07319)

\n\n[1] Sparse to Dense Dynamic 3D Facial Expression Generation(稀疏到密集的动态 3D 面部表情生成)
\nkeywords: Facial expression generation, 4D face generation, 3D face modeling
\n[paper](https://arxiv.org/pdf/2105.07463.pdf)

\n\n
\n\n \n\n### 人脸伪造/反欺骗(Face Forgery/Face Anti-Spoofing)\n\n\n[4] Self-supervised Learning of Adversarial Example: Towards Good Generalizations for Deepfake Detection(对抗样本的自监督学习：迈向 Deepfake 检测的良好泛化)
\n[paper](https://arxiv.org/abs/2203.12208) | [code](https://github.com/liangchen527/SLADD)

\n\n[3] Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing(通过 Shuffled Style Assembly 进行域泛化以进行人脸反欺骗)
\n[paper](https://arxiv.org/abs/2203.05340) | [code](https://github.com/wangzhuo2019/SSAN)

\n\n[2] Voice-Face Homogeneity Tells Deepfake
\n[paper](https://arxiv.org/abs/2203.02195) | [code](https://github.com/xaCheng1996/VFD)

\n\n[1] Protecting Celebrities from DeepFake with Identity Consistency Transformer(使用身份一致性转换器保护名人免受 DeepFake 的影响)
\n[paper](https://arxiv.org/abs/2203.01318) | [code](https://github.com/LightDXY/ICT_DeepFake)

\n\n\n
\n\n \n\n## 目标跟踪(Object Tracking)\n\n[9] Unsupervised Learning of Accurate Siamese Tracking(准确连体跟踪的无监督学习)
\n[paper](https://arxiv.org/abs/2204.01475) | [code](https://github.com/FlorinShum/ULAST)

\n\n[8] Global Tracking Transformers
\n[paper](https://arxiv.org/abs/2203.13250) | [code](https://github.com/xingyizhou/GTR)

\n\n[7] Transforming Model Prediction for Tracking(转换模型预测以进行跟踪)
\n[paper](https://arxiv.org/abs/2203.11192) | [code](https://github.com/visionml/pytracking)

\n\n[6] MixFormer: End-to-End Tracking with Iterative Mixed Attention(具有迭代混合注意力的端到端跟踪)
\n[paper](https://arxiv.org/abs/2203.11082) | [code](https://github.com/MCG-NJU/MixFormer)

\n\n[5] Unsupervised Domain Adaptation for Nighttime Aerial Tracking(夜间空中跟踪的无监督域自适应)
\n[paper](https://arxiv.org/abs/2203.10541) | [code](https://github.com/vision4robotics/UDAT)

\n\n\n[4] Iterative Corresponding Geometry: Fusing Region and Depth for Highly Efficient 3D Tracking of Textureless Objects(迭代对应几何：融合区域和深度以实现无纹理对象的高效 3D 跟踪)
\n[paper](https://arxiv.org/abs/2203.05334) | [code](https://github.com/DLR-RM/3DObjectTracking)

\n\n[3] TCTrack: Temporal Contexts for Aerial Tracking(空中跟踪的时间上下文)
\n[paper](https://arxiv.org/abs/2203.01885) | [code](https://github.com/vision4robotics/TCTrack)

\n\n[2] Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds(超越 3D 连体跟踪：点云中 3D 单对象跟踪的以运动为中心的范式)
\nkeywords: Single Object Tracking, 3D Multi-object Tracking / Detection, Spatial-temporal Learning on Point Clouds
\n[paper](https://arxiv.org/abs/2203.01730)

\n\n[1] Correlation-Aware Deep Tracking(相关感知深度跟踪)
\n[paper](https://arxiv.org/abs/2203.01666)

\n\n
\n \n\n## 图像&视频检索/视频理解(Image&Video Retrieval/Video Understanding)\n\n[7] Correlation Verification for Image Retrieval(图像检索的相关性验证)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.01458) | [code](https://github.com/sungonce/CVNet)

\n\n[6] It's About Time: Analog Clock Reading in the Wild(时间到了：野外模拟时钟读数)
\n[paper](https://arxiv.org/abs/2111.09162) | [project](https://www.robots.ox.ac.uk/~vgg/research/time)

\n\n[5] Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval(无忧素描：基于素描的抗噪图像检索)
\n[paper](https://arxiv.org/abs/2203.14817) | [code](https://github.com/AyanKumarBhunia/Stroke_Subset_Selector-for-FGSBIR)

\n\n[4] Partially Does It: Towards Scene-Level FG-SBIR with Partial Input(走向带有部分输入的场景级 FG-SBIR)
\n[paper](https://arxiv.org/abs/2203.14804)

\n\n[3] Sketch3T: Test-Time Training for Zero-Shot SBIR(零样本 SBIR 的测试时间训练)
\n[paper](https://arxiv.org/abs/2203.14691)

\n\n[2] Bridging Video-text Retrieval with Multiple Choice Questions(桥接视频文本检索与多项选择题)
\n[paper](https://arxiv.org/abs/2201.04850) | [code](https://github.com/TencentARC/MCQ)

\n\n[1] BEVT: BERT Pretraining of Video Transformers(视频Transformer的 BERT 预训练)
\nkeywords: Video understanding, Vision transformers, Self-supervised representation learning, BERT pretraining
\n[paper](https://arxiv.org/abs/2112.01529) | [code](https://github.com/xyzforever/BEVT)

\n\n\n\n \n\n### 行为识别/动作识别/检测/分割/定位(Action/Activity Recognition)\n\n[19] Revisiting Skeleton-based Action Recognition(重新审视基于骨架的动作识别)(**Oral**)
\n[paper](https://arxiv.org/abs/2104.13586) | [code](https://github.com/kennymckormick/pyskl)

\n\n[18] UnweaveNet: Unweaving Activity Stories(解开活动故事)
\n[paper](https://arxiv.org/abs/2112.10194) | [code](https://github.com/willprice/activity-stories)

\n\n[17] Dual-AI: Dual-path Action Interaction Learning for Group Activity Recognition(用于群体动作识别的双路径动作交互学习)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.02148)

\n\n[16] Detector-Free Weakly Supervised Group Activity Recognition(无检测器弱监督群体动作识别)
\n[paper](https://arxiv.org/abs/2204.02139)

\n\n[15] MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection(用于动作检测的多尺度时间 ConvTransformer)
\n[paper](https://arxiv.org/abs/2112.03902) | [code](https://github.com/dairui01/MS-TCT)

\n\n[14] Unsupervised Pre-training for Temporal Action Localization Tasks(时间动作定位任务的无监督预训练)
\n[paper](https://arxiv.org/abs/2203.13609) | [code](https://github.com/zhang-can/UP-TAL)

\n\n[13] Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos(多视图教学视频中的弱监督在线动作分割)
\n[paper](https://arxiv.org/abs/2203.13309)

\n\n[12] How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs(你怎么做呢？使用伪副词进行细粒度的动作理解)
\n[paper](https://arxiv.org/abs/2203.12344)

\n\n[11] E2(GO)MOTION: Motion Augmented Event Stream for Egocentric Action Recognition(用于以自我为中心的动作识别的运动增强事件流)
\n[paper](https://arxiv.org/abs/2112.03596)

\n\n[10] Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos(寻找变化：从未修剪的网络视频中学习对象状态和状态修改操作)
\n[paper](https://arxiv.org/abs/2203.11637) | [code](https://github.com/zju-vipa/MEAT-TIL)

\n\n[9] DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition(鲁棒动作识别的 Transformer 方法中的定向注意)
\n[paper](https://arxiv.org/abs/2203.10233)

\n\n[8] Self-supervised Video Transformer(自监督视频transformer)
\n[paper](https://arxiv.org/abs/2112.01514) | [code](https://git.io/J1juJ)

\n\n[7] Spatio-temporal Relation Modeling for Few-shot Action Recognition(小样本动作识别的时空关系建模)
\n[paper](https://arxiv.org/abs/2112.05132) | [code](https://github.com/Anirudh257/strm)

\n\n[6] RCL: Recurrent Continuous Localization for Temporal Action Detection(用于时间动作检测的循环连续定位)
\n[paper](https://arxiv.org/abs/2203.07112)

\n\n[5] OpenTAL: Towards Open Set Temporal Action Localization(走向开放集时间动作定位)
\n[paper](https://arxiv.org/abs/2203.05114) | [code](https://www.rit.edu/actionlab/opental)

\n\n[4] End-to-End Semi-Supervised Learning for Video Action Detection(视频动作检测的端到端半监督学习)
\n[paper](https://arxiv.org/abs/2203.04251)

\n\n[3] Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos(模态特定注释视频上多模态动作识别的可学习不相关模态丢失)
\n[paper](https://arxiv.org/abs/2203.03014)

\n\n[2] Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation(通过代表性片段知识传播的弱监督时间动作定位)
\n[paper](https://arxiv.org/abs/2203.02925) | [code](https://github.com/LeonHLJ/RSKP)

\n\n[1] Colar: Effective and Efficient Online Action Detection by Consulting Exemplars(通过咨询示例进行有效且高效的在线动作检测)
\nkeywords: Online action detection
\n[paper](https://arxiv.org/pdf/2203.01057.pdf)

\n\n \n\n### 行人重识别/检测(Re-Identification/Detection)\n\n[4] Camera-Conditioned Stable Feature Generation for Isolated Camera Supervised Person Re-IDentification(用于孤立摄像机监督行人重识别的摄像机条件稳定特征生成)
\n[paper](https://arxiv.org/abs/2203.15210) | [code](https://github.com/ftd-Wuchao/CCSFG)

\n\n[3] Large-Scale Pre-training for Person Re-identification with Noisy Labels(带有噪声标签的人员重新识别的大规模预训练)
\n[paper](https://arxiv.org/abs/2203.16533) | [code](https://github.com/DengpanFu/LUPerson-NL)

\n\n[2] Part-based Pseudo Label Refinement for Unsupervised Person Re-identification(用于无监督人员重新识别的基于部分的伪标签细化)
\n[paper](https://arxiv.org/abs/2203.14675) | [code](https://github.com/yoonkicho/PPLR)

\n\n[1] Cascade Transformers for End-to-End Person Search(用于端到端人员搜索的级联transformer)
\n[paper](https://arxiv.org/abs/2203.09642) | [code](https://github.com/Kitware/COAT)

\n\n \n\n### 图像/视频字幕(Image/Video Caption)\n\n[6] Quantifying Societal Bias Amplification in Image Captioning(量化图像字幕中的社会偏见放大)
\n[paper](https://arxiv.org/abs/2203.15395)

\n\n[5] NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge(从外部知识中检索词汇的新颖对象字幕)
\n[paper](https://arxiv.org/abs/2203.14499)

\n\n[4] SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning(用于视频字幕的具有稀疏注意力的端到端transformer)
\n[paper](https://arxiv.org/abs/2111.13196) | [code](https://github.com/microsoft/SwinBERT)

\n\n[3] Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources(通过在线资源对上下文外图像进行开放域、基于内容、多模式的事实检查)
\n[paper](https://arxiv.org/abs/2112.00061) | [code](https://s-abdelnabi.github.io/OoC-multi-modal-fc/)

\n\n[2] Hierarchical Modular Network for Video Captioning(用于视频字幕的分层模块化网络)
\n[paper](https://arxiv.org/abs/2111.12476) | [code](https://github.com/MarcusNerva/HMN)

\n\n[1] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning(使用 Transformer 进行 3D 密集字幕的跨模式知识迁移)\nkeywords: Image Captioning and Dense Captioning, Knowledge Distillation, Transformer, 3D Vision
\n[paper](https://arxiv.org/pdf/2203.00843.pdf)

\n\n\n \n\n## 医学影像(Medical Imaging)\n\n[8] Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis(用于 3D 医学图像分析的 Swin Transformers 的自监督预训练)
\n[paper](https://arxiv.org/abs/2111.14791) | [code](https://monai.io/research/swin-unetr)

\n\n[7] Incremental Cross-view Mutual Distillation for Self-supervised Medical CT Synthesis(用于自监督医学 CT 合成的增量交叉视图相互蒸馏)
\n[paper](https://arxiv.org/abs/2112.10325)

\n\n[6] DTFD-MIL: Double-Tier Feature Distillation Multiple Instance Learning for Histopathology Whole Slide Image Classification(用于组织病理学全幻灯片图像分类的双层特征蒸馏多实例学习)
\n[paper](https://arxiv.org/abs/2203.12081) | [code](https://github.com/hrzhang1123/DTFD-MIL)

\n\n[5] ACPL: Anti-curriculum Pseudo-labelling for Semi-supervised Medical Image Classification(半监督医学图像分类的反课程伪标签)
\n[paper](https://arxiv.org/abs/2111.12918)

\n\n[4] Vox2Cortex: Fast Explicit Reconstruction of Cortical Surfaces from 3D MRI Scans with Geometric Deep Neural Networks(使用几何深度神经网络从 3D MRI 扫描中快速显式重建皮质表面)
\n[paper](https://arxiv.org/abs/2203.09446) | [code](https://github.com/ai-med/Vox2Cortex)

\n\n[3] Generalizable Cross-modality Medical Image Segmentation via Style Augmentation and Dual Normalization(通过风格增强和双重归一化的可泛化跨模态医学图像分割)
\n[paper](https://arxiv.org/abs/2112.11177) | [code](https://github.com/zzzqzhou/Dual-Normalization)

\n\n[2] Adaptive Early-Learning Correction for Segmentation from Noisy Annotations(从噪声标签中分割的自适应早期学习校正)
\nkeywords: medical-imaging segmentation, Noisy Annotations
\n[paper](https://arxiv.org/abs/2110.03740) | [code](https://github.com/Kangningthu/ADELE)

\n\n[1] Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations(时间上下文很重要：使用疾病进展表示增强单图像预测)
\nkeywords: Self-supervised Transformer, Temporal modeling of disease progression
\n[paper](https://arxiv.org/abs/2203.01933)

\n\n\n \n\n\n## 文本检测/识别/理解(Text Detection/Recognition/Understanding)\n\n[5] Text Spotting Transformers(文本识别transformer)
\n[paper](https://arxiv.org/abs/2204.01918) | [code](https://github.com/mlpc-ucsd/TESTR)

\n\n[4] Syntax-Aware Network for Handwritten Mathematical Expression Recognition(用于手写数学表达式识别的语法感知网络)
\n[paper](https://arxiv.org/abs/2203.01601)

\n\n[3] SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition(通过文本检测和文本识别之间更好的协同作用进行场景文本定位)
\n[paper](https://arxiv.org/abs/2203.10209) | [code](https://github.com/mxin262/SwinTextSpotter)

\n\n[2] Fourier Document Restoration for Robust Document Dewarping and Recognition(用于鲁棒文档去扭曲和识别的傅里叶文档恢复)
\n[paper](https://arxiv.org/abs/2203.09910) | [code](https://sg-vilab.github.io/event/warpdoc/)

\n\n[1] XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding(迈向布局感知多模式网络，以实现视觉丰富的文档理解)
\n[paper](https://arxiv.org/abs/2203.06947)

\n\n\n\n \n\n## 遥感图像(Remote Sensing Image)\n\n[1] Exploiting Temporal Relations on Radar Perception for Autonomous Driving(利用自动驾驶雷达感知的时间关系)
\n[paper](https://arxiv.org/abs/2204.01184)

\n\n\n \n\n## GAN/生成式/对抗式(GAN/Generative/Adversarial)\n\n[16] GAN-Supervised Dense Visual Alignment(GAN监督的密集视觉对齐)(**Oral**)
\n[paper](https://arxiv.org/abs/2112.05143) | [code](https://www.github.com/wpeebles/gangealing) | [project](https://www.wpeebles.com/gangealing)

\n\n[15] Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond(迈向强大的雨水清除对抗对抗性攻击：综合基准分析及其他)
\n[paper](https://arxiv.org/abs/2203.16931) | [code](https://github.com/yuyi-sd/Robust_Rain_Removal)

\n\n[14] Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training(了解 Frank-Wolfe 对抗训练并提高效率)
\n[paper](https://arxiv.org/abs/2012.12368) | [code](https://github.com/TheoT1/FW-AT-Adapt)

\n\n[13] Feature Statistics Mixing Regularization for Generative Adversarial Networks(生成对抗网络的特征统计混合正则化)
\n[paper](https://arxiv.org/abs/2112.04120) | [code](https://github.com/naver-ai/FSMR)

\n\n[12] Subspace Adversarial Training(子空间对抗训练)
\n[paper](https://arxiv.org/abs/2111.12229) | [code](https://github.com/nblt/Sub-AT)

\n\n[11] DTA: Physical Camouflage Attacks using Differentiable Transformation Network(使用可微变换网络的物理伪装攻击)
\n[paper](https://arxiv.org/abs/2203.09831) | [code](https://islab-ai.github.io/dta-cvpr2022/)

\n\n[10] Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input(通过基于对象的多样化输入提高目标对抗样本的可迁移性)
\n[paper](https://arxiv.org/abs/2203.09123) | [code](https://github.com/dreamflake/ODI)

\n\n[9] Towards Practical Certifiable Patch Defense with Vision Transformer(使用 Vision Transformer 实现实用的可认证补丁防御)
\n[paper](https://arxiv.org/abs/2203.08519)
\n\n[8] Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment(基于松弛空间结构对齐的小样本生成模型自适应)
\n[paper](https://arxiv.org/abs/2203.04121)

\n\n[7] Enhancing Adversarial Training with Second-Order Statistics of Weights(使用权重的二阶统计加强对抗训练)
\n[paper](https://arxiv.org/abs/2203.06020) | [code](https://github.com/Alexkael/S2O)

\n\n[6] Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack(通过自适应自动攻击对对抗鲁棒性的实际评估)
\n[paper](https://arxiv.org/abs/2203.05154) | [code1](https://github.com/liuye6666/adaptive_auto_attack) | [code2](https://github.com/liuye6666/adaptive)

\n\n[5] Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity(对语义相似性的频率驱动的不可察觉的对抗性攻击)
\n[paper](https://arxiv.org/abs/2203.05151)\n\n[4] Shadows can be Dangerous: Stealthy and Effective Physical-world Adversarial Attack by Natural Phenomenon(阴影可能很危险：自然现象的隐秘而有效的物理世界对抗性攻击)
\n[paper](https://arxiv.org/abs/2203.03818)

\n\n[3] Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-robust Makeup Transfer(保护面部隐私：通过风格稳健的化妆转移生成对抗性身份面具)
\n[paper](https://arxiv.org/pdf/2203.03121.pdf)

\n\n[2] Adversarial Texture for Fooling Person Detectors in the Physical World(物理世界中愚弄人探测器的对抗性纹理)
\n[paper](https://arxiv.org/abs/2203.03373)

\n\n[1] Label-Only Model Inversion Attacks via Boundary Repulsion(通过边界排斥的仅标签模型反转攻击)
\n[paper](https://arxiv.org/pdf/2203.01925.pdf)
\n\n\n
\n\n \n\n## 图像生成/图像合成(Image Generation/Image Synthesis)\n\n[13] Exemplar-based Pattern Synthesis with Implicit Periodic Field Network(具有隐式周期场网络的示例模式合成)
\n[paper](https://arxiv.org/abs/2204.01671)

\n\n[12] Styleformer: Transformer based Generative Adversarial Networks with Style Vector(具有样式向量的基于 Transformer 的生成对抗网络)
\n[paper](https://arxiv.org/abs/2106.07023) | [code](https://github.com/Jeeseung-Park/Styleformer)

\n\n[11] Modulated Contrast for Versatile Image Synthesis(用于多功能图像合成的调制对比度)
\n[paper](https://arxiv.org/abs/2203.09333) | [code](https://github.com/fnzhan/MoNCE)

\n\n[10] Attribute Group Editing for Reliable Few-shot Image Generation(属性组编辑用于可靠的小样本图像生成)
\n[paper](https://arxiv.org/abs/2203.08422) | [code](https://github.com/UniBester/AGE)

\n\n[9] Text to Image Generation with Semantic-Spatial Aware GAN(使用语义空间感知 GAN 生成文本到图像)
\n[paper](https://arxiv.org/abs/2104.00567) | [code](https://github.com/wtliao/text2image)

\n\n[8] Playable Environments: Video Manipulation in Space and Time(可播放环境：空间和时间的视频操作)
\n[paper](https://arxiv.org/abs/2203.01914) | [code](https://willi-menapace.github.io/playable-environments-website)

\n\n\n[7] FLAG: Flow-based 3D Avatar Generation from Sparse Observations(从稀疏观察中生成基于流的 3D 头像)
\n[paper](https://arxiv.org/abs/2203.05789) | [project](https://microsoft.github.io/flag)

\n\n[6] Dynamic Dual-Output Diffusion Models(动态双输出扩散模型)
\n[paper](https://arxiv.org/abs/2203.04304)

\n\n[5] Exploring Dual-task Correlation for Pose Guided Person Image Generation(探索姿势引导人物图像生成的双任务相关性)
\n[paper](https://arxiv.org/abs/2203.02910) | [code](https://github.com/PangzeCheung/Dual-task-Pose-Transformer-Network)

\n\n\n\n[4] 3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces(基于小批量特征交换的三维形状变化自动编码器潜在解纠缠)
\n[paper](https://arxiv.org/pdf/2111.12448.pdf) | [code](https://github.com/simofoti/3DVAE-SwapDisentangled)

\n\n[3] Interactive Image Synthesis with Panoptic Layout Generation(具有全景布局生成的交互式图像合成)
\n[paper](https://arxiv.org/abs/2203.02104)

\n\n[2] Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values(极性采样：通过奇异值对预训练生成网络的质量和多样性控制)
\n[paper](https://arxiv.org/abs/2203.01993) | [demo](http://bit.ly/polarity-demo-colab)

\n\n[1] Autoregressive Image Generation using Residual Quantization(使用残差量化的自回归图像生成)
\n[paper](https://arxiv.org/abs/2203.01941) | [code](https://github.com/kakaobrain/rq-vae-transformer)

\n\n\n\n\n\n
\n\n \n\n## 三维视觉(3D Vision)\n\n[5] Fast Point Transformer
\n[paper](https://arxiv.org/abs/2112.04702) | [project](http://cvlab.postech.ac.kr/research/FPT)

\n\n[4] Towards Implicit Text-Guided 3D Shape Generation(迈向隐式文本引导的 3D 形状生成)
\n[paper](https://arxiv.org/abs/2203.14622) | [code](https://github.com/liuzhengzhe/Towards-Implicit)

\n\n[3] The Neurally-Guided Shape Parser: Grammar-based Labeling of 3D Shape Regions with Approximate Inference(神经引导的形状解析器：具有近似推理的 3D 形状区域的基于语法的标记)
\n[paper](https://arxiv.org/abs/2106.12026) | [code](https://github.com/rkjones4/NGSP)

\n\n[2] Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings(在 3D 网格中嵌入消息并从 2D 渲染中提取它们)
\n[paper](https://arxiv.org/abs/2104.13450)

\n\n[1] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning(使用 Transformer 进行 3D 密集字幕的跨模式知识迁移)\nkeywords: Image Captioning and Dense Captioning, Knowledge Distillation, Transformer, 3D Vision
\n[paper](https://arxiv.org/pdf/2203.00843.pdf)

\n\n
\n\n \n\n### 点云(Point Cloud)\n\n[14] REGTR: End-to-end Point Cloud Correspondences with Transformers(与 Transformer 的端到端点云匹配)
\n[paper](https://arxiv.org/abs/2203.14517) | [code](https://github.com/yewzijian/RegTR)

\n\n[13] Stratified Transformer for 3D Point Cloud Segmentation(用于 3D 点云分割的分层transformer)
\n[paper](https://arxiv.org/abs/2203.14508) | [code](https://github.com/dvlab-research/Stratified-Transformer)

\n\n[12] AziNorm: Exploiting the Radial Symmetry of Point Cloud for Azimuth-Normalized 3D Perception(利用点云的径向对称性进行方位归一化 3D 感知)
\n[paper](https://arxiv.org/abs/2203.13090) | [code](https://github.com/hustvl/AziNorm)

\n\n[11] WarpingGAN: Warping Multiple Uniform Priors for Adversarial 3D Point Cloud Generation(为对抗性 3D 点云生成扭曲多个均匀先验)
\n[paper](https://arxiv.org/abs/2203.12917) | [code](https://github.com/yztang4/WarpingGAN.git)

\n\n[10] IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding Alignment(通过深度嵌入对齐的动态 3D 点云插值)
\n[paper](https://arxiv.org/abs/2203.11590) | [code](https://github.com/ZENGYIMING-EAMON/IDEA-Net.git)

\n\n[9] No Pain, Big Gain: Classify Dynamic Point Cloud Sequences with Static Models by Fitting Feature-level Space-time Surfaces(没有痛苦，收获很大：通过拟合特征级时空表面，用静态模型对动态点云序列进行分类)
\n[paper](https://arxiv.org/abs/2203.11113) | [code](https://github.com/jx-zhong-for-academic-purpose/Kinet)

\n\n[8] AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation(通用 3D 零件分割的中间监督搜索)\n[paper](https://arxiv.org/abs/2203.06558)

\n\n[7] Geometric Transformer for Fast and Robust Point Cloud Registration(用于快速和稳健点云配准的几何transformer)
\n[paper](https://arxiv.org/abs/2202.06688) | [code](https://github.com/qinzheng93/GeoTransformer)

\n\n[6] Contrastive Boundary Learning for Point Cloud Segmentation(点云分割的对比边界学习)
\n[paper](https://arxiv.org/abs/2203.05272) | [code](https://github.com/LiyaoTang/contrastBoundary)

\n\n[5] Shape-invariant 3D Adversarial Point Clouds(形状不变的 3D 对抗点云)
\n[paper](https://arxiv.org/abs/2203.04041) | [code](https://github.com/shikiw/SI-Adv)

\n\n[4] ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation(通过对抗旋转提高点云分类器的旋转鲁棒性)
\n[paper](https://arxiv.org/abs/2203.03888)

\n\n[3] Lepard: Learning partial point cloud matching in rigid and deformable scenes(Lepard：在刚性和可变形场景中学习部分点云匹配)
\n[paper](https://arxiv.org/abs/2111.12591) | [code](https://github.com/rabbityl/lepard)

\n\n[2] A Unified Query-based Paradigm for Point Cloud Understanding(一种基于统一查询的点云理解范式)
\n[paper](https://arxiv.org/pdf/2203.01252.pdf)

\n\n[1] CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding(用于 3D 点云理解的自监督跨模态对比学习)
\nkeywords: Self-Supervised Learning, Contrastive Learning, 3D Point Cloud, Representation Learning, Cross-Modal Learning
\n[paper](https://arxiv.org/abs/2203.00680) | [code](http://github.com/MohamedAfham/CrossPoint)

\n\n
\n\n\n \n\n### 三维重建(3D Reconstruction)\n\n[17] I M Avatar: Implicit Morphable Head Avatars from Videos(视频中的隐式可变形头部头像)(**Oral**)
\n[paper](https://arxiv.org/abs/2112.07471) | [project](https://ait.ethz.ch/projects/2022/IMavatar/)

\n\n[16] BNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion(使用双层神经体积融合的密集 3D 重建)
\n[paper](https://arxiv.org/abs/2204.01139)

\n\n[15] SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video(从单目视频自我重建你的数字化身)(**Oral**)
\n[paper](https://arxiv.org/abs/2201.12792) | [code](https://github.com/jby1993/SelfReconCode)

\n\n[14] LISA: Learning Implicit Shape and Appearance of Hands(学习手的隐式形状和外观)
\n[paper](https://arxiv.org/abs/2204.01695) | [project](https://www.iri.upc.edu/people/ecorona/lisa/)

\n\n[13] BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information(通过利用品种信息学习从图像中回归 3D 狗形状)
\n[paper](https://arxiv.org/abs/2203.15536) | [code](https://barc.is.tue.mpg.de/)

\n\n[12] Uncertainty-Aware Deep Multi-View Photometric Stereo(不确定性感知深度多视图光度立体)
\n[paper](https://arxiv.org/abs/2202.13071)

\n\n[11] Neural Reflectance for Shape Recovery with Shadow Handling(使用阴影处理进行形状恢复的神经反射)
\n[paper](https://arxiv.org/abs/2203.12909) | [code](https://github.com/junxuan-li/Neural-Reflectance-PS)

\n\n[10] PLAD: Learning to Infer Shape Programs with Pseudo-Labels and Approximate Distributions(学习用伪标签和近似分布推断形状程序)
\n[paper](https://arxiv.org/abs/2011.13045) | [code](https://github.com/rkjones4/PLAD)

\n\n[9] ϕ-SfT: Shape-from-Template with a Physics-Based Deformation Model(具有基于物理的变形模型的模板形状)
\n[paper](https://arxiv.org/abs/2203.11938) | [code](https://4dqv.mpi-inf.mpg.de/phi-SfT/)

\n\n[8] Input-level Inductive Biases for 3D Reconstruction(用于 3D 重建的输入级归纳偏差)
\n[paper](https://arxiv.org/abs/2112.03243)

\n\n[7] AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation(用于 3D 完成、重建和生成的形状先验)
\n[paper](https://arxiv.org/abs/2203.09516) | [project](https://yccyenchicheng.github.io/AutoSDF/)

\n\n[6] Interacting Attention Graph for Single Image Two-Hand Reconstruction(单幅图像双手重建的交互注意力图)
\n[paper](https://arxiv.org/abs/2203.09364) | [code](https://github.com/Dw1010/IntagHand)

\n\n[5] OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction(实时动态 3D 重建的遮挡感知运动估计)
\n[paper](https://arxiv.org/abs/2203.07977) | [project](https://wenbin-lin.github.io/OcclusionFusion)

\n\n\n[4] Neural RGB-D Surface Reconstruction(神经 RGB-D 表面重建)
\n[paper](https://arxiv.org/abs/2104.04532) | [project](https://dazinovic.github.io/neural-rgbd-surface-reconstruction/) | [video](https://youtu.be/iWuSowPsC3g)

\n\n[3] Neural Face Identification in a 2D Wireframe Projection of a Manifold Object(流形对象的二维线框投影中的神经人脸识别)
\n[paper](https://arxiv.org/abs/2203.04229) | [code](https://manycore-research.github.io/faceformer) | [project](https://manycore-research.github.io/faceformer)

\n\n[2] Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers(使用伤口分割和重建生成 3D 生物可打印贴片以治疗糖尿病足溃疡)
\nkeywords: semantic segmentation, 3D reconstruction, 3D bio-printers
\n[paper](https://arxiv.org/pdf/2203.03814.pdf)
\n\n[1] H4D: Human 4D Modeling by Learning Neural Compositional Representation(通过学习神经组合表示进行人体 4D 建模)
\nkeywords: 4D Representation, Human Body Estimation, Fine-grained Human Reconstruction

\n[paper](https://arxiv.org/pdf/2203.01247.pdf)
\n\n \n\n### 场景重建/视图合成/新视角合成(Novel View Synthesis)\n\n[17] RayMVSNet: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo(学习基于光线的 1D 隐式场以实现准确的多视图立体)
\n[paper](https://arxiv.org/abs/2204.01320)

\n\n[16] Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis(用于可控 3D 人体合成的表面对齐神经辐射场)
\n[paper](https://arxiv.org/abs/2201.01683) | [project](https://pfnet-research.github.io/surface-aligned-nerf/)

\n\n[15] IRON: Inverse Rendering by Optimizing Neural SDFs and Materials from Photometric Images(通过优化来自光度图像的神经 SDF 和材料进行反向渲染)
\n[paper](https://arxiv.org/abs/2204.02232) | [project](https://kai-46.github.io/IRON-website/)

\n\n[14] MonoScene: Monocular 3D Semantic Scene Completion(单目 3D 语义场景完成)
\n[paper](https://arxiv.org/abs/2112.00726) | [code](https://github.com/cv-rits/MonoScene) | [project](https://cv-rits.github.io/MonoScene/)

\n\n[13] Stereo Magnification with Multi-Layer Images(具有多层图像的立体放大)
\n[paper](https://arxiv.org/abs/2201.05023) | [code](https://samsunglabs.github.io/StereoLayers)

\n\n[12] Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations(通过集合潜在场景表示的无几何新颖视图合成)
\n[paper](https://arxiv.org/abs/2111.13152) | [project](https://srt-paper.github.io/)

\n\n[11] Neural Rays for Occlusion-aware Image-based Rendering(用于遮挡感知的基于图像的渲染的神经射线)
\n[paper](https://arxiv.org/abs/2107.13421) | [project](https://liuyuan-pal.github.io/NeuRay/) | [code](https://github.com/liuyuan-pal/NeuRay)

\n\n[10] Deblur-NeRF: Neural Radiance Fields from Blurry Images(来自模糊图像的神经辐射场)
\n[paper](https://arxiv.org/abs/2111.14292) | [code](https://limacv.github.io/deblurnerf/)

\n\n[9] NPBG++: Accelerating Neural Point-Based Graphics(加速基于神经点的图形)
\n[paper](https://arxiv.org/abs/2203.13318) | [project](https://rakhimovv.github.io/npbgpp/)

\n\n[8] PlaneMVS: 3D Plane Reconstruction from Multi-View Stereo(从多视图立体重建 3D 平面)
\n[paper](https://arxiv.org/abs/2203.12082)

\n\n[7] NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction(用于大规模场景重建的融合辐射场)
\n[paper](https://arxiv.org/abs/2203.11283)

\n\n[6] GeoNeRF: Generalizing NeRF with Geometry Priors(用几何先验概括 NeRF)
\n[paper](https://arxiv.org/abs/2111.13539) | [code](https://www.idiap.ch/paper/geonerf)

\n\n[5] StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions(室内 3D 场景重建的风格转换)
\n[paper](https://arxiv.org/abs/2112.01530) | [code](https://github.com/lukasHoel/stylemesh) | [project](https://lukashoel.github.io/stylemesh/)

\n\n[4] Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image(向外看：从单个图像合成一致的长期 3D 场景视频)
\n[paper](https://arxiv.org/abs/2203.09457) | [code](https://github.com/xrenaa/Look-Outside-Room) | [project](https://xrenaa.github.io/look-outside-room/)

\n\n[3] Point-NeRF: Point-based Neural Radiance Fields(基于点的神经辐射场)
\n[paper](https://arxiv.org/abs/2201.08845) | [code](https://github.com/Xharlie/pointnerf) | [project](https://xharlie.github.io/projects/project_sites/pointnerf)

\n\n[2] CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields(文本和图像驱动的神经辐射场操作)
\nkeywords: NeRF, Image Generation and Manipulation, Language-Image Pre-Training (CLIP)
\n[paper](https://arxiv.org/abs/2112.05139) | [code](https://cassiepython.github.io/clipnerf/)


\n\n \n\n## 模型压缩(Model Compression)\n\n
\n\n \n\n### 知识蒸馏(Knowledge Distillation)\n\n[4] Decoupled Knowledge Distillation(解耦知识蒸馏)
\n[paper](https://arxiv.org/abs/2203.08679) | [code](https://github.com/megvii-research/mdistiller)

\n\n[3] Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation(小波知识蒸馏：迈向高效的图像到图像转换)
\n[paper](https://arxiv.org/abs/2203.06321)

\n\n[2] Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability(知识蒸馏作为高效的预训练：更快的收敛、更高的数据效率和更好的可迁移性)
\n[paper](https://arxiv.org/abs/2203.05180) | [code](https://github.com/CVMI-Lab/KDEP)

\n\n[1] Focal and Global Knowledge Distillation for Detectors(探测器的焦点和全局知识蒸馏)
\nkeywords: Object Detection, Knowledge Distillation
\n[paper](https://arxiv.org/abs/2111.11837) | [code](https://github.com/yzd-v/FGD)

\n\n \n\n### 剪枝(Pruning)\n\n[2] CHEX: CHannel EXploration for CNN Model Compression(CNN模型压缩的通道探索)
\n[paper](https://arxiv.org/abs/2203.15794) | [code](https://github.com/zejiangh/Filter-GaP)

\n\n[1] Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs(空间剪枝：使用自适应滤波器表示来改进稀疏 CNN 的训练)
\n[paper](https://arxiv.org/abs/2203.07808)

\n\n \n\n### 量化(Quantization)\n\n[3] It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher(一切尽在老师身上：零样本量化更贴近老师)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.17008)

\n\n[2] Implicit Feature Decoupling with Depthwise Quantization(使用深度量化的隐式特征解耦)
\n[paper](https://arxiv.org/abs/2203.08080)

\n\n[1] IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization(学习具有类内异质性的合成图像以进行零样本网络量化)
\n[paper](https://arxiv.org/abs/2111.09136) | [code](https://github.com/zysxmu/IntraQ)

\n\n\n\n
\n\n \n\n## 神经网络结构设计(Neural Network Structure Design)\n\n[2] DyRep: Bootstrapping Training with Dynamic Re-parameterization(使用动态重新参数化的引导训练)
\n[paper](https://arxiv.org/abs/2203.12868) | [code](https://github.com/hunto/DyRep)

\n\n[1] BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning(学习探索样本关系以进行鲁棒表征学习)
\nkeywords: sample relationship, data scarcity learning, Contrastive Self-Supervised Learning, long-tailed recognition, zero-shot learning, domain generalization, self-supervised learning
\n[paper](https://arxiv.org/abs/2203.01522) | [code](https://github.com/zhihou7/BatchFormer)

\n\n
\n\n \n\n### CNN\n\n[5] TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing(用于布局感知视觉处理的高效平移可变卷积)(dynamic convolution)
\n[paper](https://arxiv.org/abs/2203.10489) | [code](https://github.com/JierunChen/TVConv)

\n\n[4] On the Integration of Self-Attention and Convolution(自注意力和卷积的整合)
\n[paper](https://arxiv.org/abs/2111.14556) | [code1](https://github.com/LeapLabTHU/ACmix) | [code2](https://gitee.com/mindspore/models)

\n\n[3] Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs(将内核扩展到 31x31：重新审视 CNN 中的大型内核设计)
\n[paper](https://arxiv.org/abs/2203.06717) | [code](https://github.com/megvii-research/RepLKNet)
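\nCommentary: [Why can a 31x31 convolution kernel run about as fast as a 9x9 convolution?](https://zhuanlan.zhihu.com/p/479182218)
\nCommentary: [RepLKNet: large-kernel convolution plus structural re-parameterization makes CNNs great again](https://zhuanlan.zhihu.com/p/480935774)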

\n\n[2] DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos(视频中稀疏帧差异的端到端 CNN 推断)
\nkeywords: sparse convolutional neural network, video inference accelerating
\n[paper](https://arxiv.org/abs/2203.03996)
\n\n[1] A ConvNet for the 2020s
\n[paper](https://arxiv.org/abs/2201.03545) | [code](https://github.com/facebookresearch/ConvNeXt)
\nCommentary: [“Renaissance”: ConvNets make a comeback and surpass Transformers! FAIR redesigns a pure-convolution architecture](https://mp.weixin.qq.com/s/q-s_dV4-TCiVPMOTZKEgPQ)

\n\n
\n\n \n\n### Transformer\n\n[9] Patch Slimming for Efficient Vision Transformers(高效视觉transformer的补丁瘦身)
\n[paper](https://arxiv.org/abs/2106.02852)

\n\n[8] CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance(具有几何制导的基于码本的稀疏体素transformer)
\n[paper](https://arxiv.org/abs/2203.09887)

\n\n[7] MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens(通过操作信使token交换本地空间信息)
\n[paper](https://arxiv.org/abs/2105.15168) | [code](https://github.com/hustvl/MSG-Transformer)

\n\n[6] BoxeR: Box-Attention for 2D and 3D Transformers(用于 2D 和 3D transformer的 Box-Attention)
\n[paper](https://arxiv.org/abs/2111.13087) | [code](https://github.com/kienduynguyen/BoxeR)

\n\n[5] Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training(引导 ViT：从预训练中解放视觉transformer)
\n[paper](https://arxiv.org/abs/2112.03552) | [code](https://github.com/zhfeing/Bootstrapping-ViTs-pytorch)

\n\n[4] Attribute Surrogates Learning and Spectral Tokens Pooling in Transformers for Few-shot Learning
\n[paper](https://arxiv.org/abs/2203.09064) | [code](https://github.com/StomachCold/HCTransformers)

\n\n[3] NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition(在视觉transformer中为视觉识别指定协同上下文)
\n[paper](https://arxiv.org/abs/2111.12994) | [code](https://github.com/TencentYoutuResearch/VisualRecognition-NomMer)

\n\n[2] Delving Deep into the Generalization of Vision Transformers under Distribution Shifts(深入研究分布变化下的视觉Transformer的泛化)
\nkeywords: out-of-distribution (OOD) generalization, Vision Transformers
\n[paper](https://arxiv.org/abs/2106.07617) | [code](https://github.com/Phoenix1153/ViT_OOD_generalization)

\n\n[1] Mobile-Former: Bridging MobileNet and Transformer(连接 MobileNet 和 Transformer)
\nkeywords: Light-weight convolutional neural networks(轻量卷积神经网络), Combination of CNN and ViT
\n[paper](https://arxiv.org/abs/2108.05895)

\n\n
\n\n \n\n### 图神经网络(GNN)\n\n[2] Improving Subgraph Recognition with Variational Graph Information Bottleneck(利用变分图信息瓶颈改进子图识别)
\n[paper](https://arxiv.org/abs/2112.09899) | [code](https://github.com/Samyu0304/VGIB)

\n\n[1] AEGNN: Asynchronous Event-based Graph Neural Networks(基于异步事件的图神经网络)
\n[paper](https://arxiv.org/abs/2203.17149) | [project](https://uzh-rpg.github.io/aegnn/)

\n\n
\n\n \n\n### 神经网络架构搜索(NAS)\n\n[4] Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?(从实用的角度揭开神经切线内核的神秘面纱：无需训练就可以信任神经架构搜索吗？)
\n[paper](https://arxiv.org/abs/2203.14577) | [code](https://github.com/nutellamok/DemystifyingNTK)

\n\n[3] Training-free Transformer Architecture Search(免训练transformer架构搜索)
\n[paper](https://arxiv.org/abs/2203.12217)

\n\n[2] Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning(MAML 的全局收敛和受理论启发的神经架构搜索以进行 Few-Shot 学习)
\n[paper](https://arxiv.org/abs/2203.09137) | [code](https://github.com/YiteWang/MetaNTK-NAS)

\n\n[1] β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search(可微架构搜索的 Beta-Decay 正则化)
\n[paper](https://arxiv.org/abs/2203.01665)

\n\n \n\n### MLP\n\n[5] MAXIM: Multi-Axis MLP for Image Processing(用于图像处理的多轴 MLP)(**Oral**)
\n[paper](https://arxiv.org/abs/2201.02973) | [code](https://github.com/google-research/maxim)

\n\n[4] Brain-inspired Multilayer Perceptron with Spiking Neurons(具有尖峰神经元的类脑多层感知器)
\n[paper](https://arxiv.org/abs/2203.14679) | [code](https://gitee.com/mindspore/models/tree/master/research/cv/snnmlp)

\n\n[3] Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information(利用地理和时间信息进行细粒度图像分类的动态 MLP)
\n[paper](https://arxiv.org/abs/2203.03253) | [code](https://github.com/ylingfeng/DynamicMLP.git)

\n\n[2] Revisiting the Transferability of Supervised Pretraining: an MLP Perspective(重新审视监督预训练的可迁移性：MLP 视角)
\n[paper](https://arxiv.org/abs/2112.00496)

\n\n[1] An Image Patch is a Wave: Quantum Inspired Vision MLP(图像补丁是波浪：量子启发的视觉 MLP)
\n[paper](https://arxiv.org/abs/2111.12294) | [code1](https://github.com/huawei-noah/CV-Backbones/tree/master/wavemlp_pytorch) | [code2](https://gitee.com/mindspore/models/tree/master/research/cv/wave_mlp)

\n\n\n
\n\n \n\n## 数据处理(Data Processing)\n\n[2] Generating High Fidelity Data from Low-density Regions using Diffusion Models(使用扩散模型从低密度区域生成高保真数据)
\n[paper](https://arxiv.org/abs/2203.17260)

\n\n[1] Dataset Distillation by Matching Training Trajectories(通过匹配训练轨迹进行蒸馏)(**dataset distillation**)
\n[paper](https://arxiv.org/abs/2203.11932) | [code](https://github.com/GeorgeCazenavette/mtt-distillation) | [project](https://georgecazenavette.github.io/mtt-distillation/)

\n\n \n\n### 数据增广(Data Augmentation)\n\n\n[3] EnvEdit: Environment Editing for Vision-and-Language Navigation(视觉语言导航的环境编辑)
\n[paper](https://arxiv.org/abs/2203.15685) | [code](https://github.com/jialuli-luka/EnvEdit)

\n\n[2] TeachAugment: Data Augmentation Optimization Using Teacher Knowledge(使用教师知识进行数据增强优化)
\n[paper](https://arxiv.org/abs/2202.12513) | [code](https://github.com/DensoITLab/TeachAugment)

\n\n[1] 3D Common Corruptions and Data Augmentation(3D 常见损坏和数据增强)(**Oral**)
\nkeywords: Data Augmentation, Image restoration, Photorealistic image synthesis
\n[paper](https://arxiv.org/abs/2203.01441) | [project](https://3dcommoncorruptions.epfl.ch/)

\n\n\n\n
\n\n \n\n### 归一化/正则化(Batch Normalization)\n\n[1] Delving into the Estimation Shift of Batch Normalization in a Network(深入研究网络中批量标准化的估计偏移)
\n[paper](https://arxiv.org/abs/2203.10778) | [code](https://github.com/huangleiBuaa/XBNBlock)

\n\n
\n\n \n\n### 图像聚类(Image Clustering)\n\n[1] RAMA: A Rapid Multicut Algorithm on GPU(GPU 上的快速多切算法)
\n[paper](https://arxiv.org/abs/2109.01838) | [code](https://github.com/pawelswoboda/RAMA)

\n\n
\n\n\n \n\n### 图像压缩(Image Compression)\n\n[4] Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression(用于高效神经图像压缩的统一多元高斯混合)
\n[paper](https://arxiv.org/abs/2203.10897) | [code](https://github.com/xiaosu-zhu/McQuic)

\n\n[3] ELIC: Efficient Learned Image Compression with Unevenly Grouped Space-Channel Contextual Adaptive Coding(具有不均匀分组的空间通道上下文自适应编码的高效学习图像压缩)
\n[paper](https://arxiv.org/abs/2203.10886)

\n\n[2] The Devil Is in the Details: Window-based Attention for Image Compression(细节中的魔鬼：图像压缩的基于窗口的注意力)
\n[paper](https://arxiv.org/abs/2203.08450) | [code](https://github.com/Googolxx/STF)

\n\n[1] Neural Data-Dependent Transform for Learned Image Compression(用于学习图像压缩的神经数据相关变换)
\n[paper](https://arxiv.org/abs/2203.04963) | [project](https://dezhao-wang.github.io/Neural-Syntax-Website/)

\n\n\n\n
\n\n \n\n## 模型训练/泛化(Model Training/Generalization)\n\n[11] Parameter-free Online Test-time Adaptation(无参数在线测试时间自适应)(**Oral**)
\n[paper](https://arxiv.org/abs/2201.05718) | [code](https://github.com/fiveai/LAME)

\n\n[10] SNUG: Self-Supervised Neural Dynamic Garments(自我监督的神经动态服装)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.02219) | [project](http://mslab.es/projects/SNUG/)

\n\n[9] Automated Progressive Learning for Efficient Training of Vision Transformers(用于高效训练视觉transformer的自动渐进式学习)
\n[paper](https://arxiv.org/abs/2203.14509) | [code](https://github.com/changlin31/AutoProg)

\n\n[8] GradViT: Gradient Inversion of Vision Transformers(视觉transformer的梯度反转)
\n[paper](https://arxiv.org/abs/2203.11894) | [project](https://gradvit.github.io/)

\n\n[7] Recall@k Surrogate Loss with Large Batches and Similarity Mixup(大批量和相似性混合的 Recall@k 代理损失)
\n[paper](https://arxiv.org/abs/2108.11179)

\n\n[6] Out-of-distribution Generalization with Causal Invariant Transformations(具有因果不变变换的分布外泛化)
\n[paper](https://arxiv.org/abs/2203.11528)

\n\n[5] Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective(神经网络可以两次学习相同的模型吗？从决策边界的角度研究可重复性和双重下降)
\n[paper](https://arxiv.org/abs/2203.08124) | [code](https://github.com/somepago/dbViz)

\n\n[4] Towards Efficient and Scalable Sharpness-Aware Minimization(迈向高效和可扩展的锐度感知最小化)
\nkeywords: Sharp Local Minima, Large-Batch Training
\n[paper](https://arxiv.org/abs/2203.02714)

\n\n[3] CAFE: Learning to Condense Dataset by Aligning Features(通过对齐特征学习压缩数据集)
\nkeywords: dataset condensation, coreset selection, generative models
\n[paper](https://arxiv.org/pdf/2203.01531.pdf) | [code](https://github.com/kaiwang960112/CAFE)

\n\n[2] The Devil is in the Margin: Margin-based Label Smoothing for Network Calibration(魔鬼在边缘：用于网络校准的基于边缘的标签平滑)
\n[paper](https://arxiv.org/abs/2111.15430) | [code](https://github.com/by-liu/MbLS)

\n\n[1] DN-DETR: Accelerate DETR Training by Introducing Query DeNoising(通过引入查询去噪加速 DETR 训练)
\nkeywords: Detection Transformer
\n[paper](https://arxiv.org/abs/2203.01305) | [code](https://github.com/FengLi-ust/DN-DETR)

\n\n
\n\n \n\n### 噪声标签(Noisy Label)\n\n[3] UNICON: Combating Label Noise Through Uniform Selection and Contrastive Learning(通过统一选择和对比学习来对抗标签噪声)
\n[paper](https://arxiv.org/abs/2203.14542) | [code](https://github.com/nazmul-karim170/UNICON-Noisy-Label)

\n\n[2] Scalable Penalized Regression for Noise Detection in Learning with Noisy Labels(带有噪声标签的学习中噪声检测的可扩展惩罚回归)
\n[paper](https://arxiv.org/abs/2203.07788) | [code](https://github.com/Yikai-Wang/SPR-LNL)


\n\n\n\n
\n\n \n\n### 长尾分布(Long-Tailed Distribution)\n\n[1] Targeted Supervised Contrastive Learning for Long-Tailed Recognition(用于长尾识别的有针对性的监督对比学习)
\nkeywords: Long-Tailed Recognition(长尾识别), Contrastive Learning(对比学习)
\n[paper](https://arxiv.org/pdf/2111.13998.pdf)

\n\n\n
\n\n \n\n\n## 图像特征提取与匹配(Image feature extraction and matching)\n\n[1] Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences(弱监督语义对应的概率扭曲一致性)
\n[paper](https://arxiv.org/abs/2203.04279) | [code](https://github.com/PruneTruong/DenseMatching)

\n\n
\n\n \n\n## 视觉表征学习(Visual Representation Learning)\n\n[4] Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization(通过节点到邻域互信息最大化的图中节点表示学习)
\n[paper](https://arxiv.org/abs/2203.12265) | [code](https://github.com/dongwei156/n2n)

\n\n[3] SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization(通过相似性感知归一化探索场景文本的自监督表示学习)
\n[paper](https://arxiv.org/abs/2203.10492)

\n\n[2] Exploring Set Similarity for Dense Self-supervised Representation Learning(探索密集自监督表示学习的集合相似性)
\n[paper](https://arxiv.org/abs/2107.08712)

\n\n[1] Motion-aware Contrastive Video Representation Learning via Foreground-background Merging(通过前景-背景合并的运动感知对比视频表示学习)
\n[paper](https://arxiv.org/abs/2109.15130) | [code](https://github.com/Mark12Ding/FAME)

\n\n
\n\n \n\n## 模型评估(Model Evaluation)\n\n\n\n
\n\n \n\n## 多模态学习(Multi-Modal Learning)\n\n[1] MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound(通过视觉、语言和声音的神经脚本知识)
\n[paper](https://arxiv.org/abs/2201.02639) | [project](https://rowanzellers.com/merlotreserve)

\n\n
\n\n \n\n### 视听学习(Audio-visual Learning)\n\n[4] Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language(具有跨模态注意力和语言的视听广义零样本学习)
\n[paper](https://arxiv.org/abs/2203.03598) | [code](https://github.com/ExplainableML/AVCA-GZSL)

\n\n[3] Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes(自监督预测学习：视觉场景中声源定位的无负法方法)(**visual localization**)
\n[paper](https://arxiv.org/abs/2203.13412) | [code](https://github.com/zjsong/SSPL)

\n\n[2] Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation(用于协同语音手势生成的学习分层跨模式关联)
\n[paper](https://arxiv.org/abs/2203.13161) | [project](https://alvinliu0.github.io/projects/HA2G)

\n\n[1] UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection(用于联合视频时刻检索和高光检测的统一多模态transformer)
\n[paper](https://arxiv.org/abs/2203.12745) | [code](https://github.com/TencentARC/UMT)

\n\n\n
\n\n \n\n### 视觉-语言(Vision-language)\n\n[14] DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation(用于鲁棒图像处理的文本引导扩散模型)
\n[paper](https://arxiv.org/abs/2110.02711) | [code](https://github.com/gwang-kim/DiffusionCLIP.git)

\n\n[13] StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis(走向合成和高保真文本到图像的合成)
\n[paper](https://arxiv.org/abs/2203.15799)

\n\n[12] LiT: Zero-Shot Transfer with Locked-image text Tuning(带锁定图像文本调整的零样本迁移)
\n[paper](https://arxiv.org/abs/2111.07991)

\n\n[11] VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks(视觉和语言任务的参数高效迁移学习)
\n[paper](https://arxiv.org/abs/2112.06825) | [code](https://github.com/ylsung/VL_adapter)

\n\n[10] Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model(预测、预防和评估：由预训练的视觉语言模型支持的解耦的文本驱动图像处理)
\n[paper](https://arxiv.org/abs/2111.13333) | [code](https://github.com/zipengxuc/PPE)

\n\n[9] LAFITE: Towards Language-Free Training for Text-to-Image Generation(面向文本到图像生成的无语言培训)
\n[paper](https://arxiv.org/abs/2111.13792) | [code](https://github.com/drboog/Lafite)

\n\n[8] An Empirical Study of Training End-to-End Vision-and-Language Transformers(培训端到端视觉和语言transformer的实证研究)
\n[paper](https://arxiv.org/abs/2111.02387) | [code](https://github.com/zdou0830/METER)

\n\n[7] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding(为视觉基础生成伪语言查询)
\n[paper](https://arxiv.org/abs/2203.08481) | [code](https://github.com/LeapLabTHU/Pseudo-Q)

\n\n[6] Conditional Prompt Learning for Vision-Language Models(视觉语言模型的条件提示学习)
\n[paper](https://arxiv.org/abs/2203.05557) | [code](https://github.com/KaiyangZhou/CoOp)

\n\n[5] NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks(视觉和视觉语言任务中的自然语言解释模型)
\n[paper](https://arxiv.org/abs/2203.05081) | [code](https://github.com/fawazsammani/nlxgpt)

\n\n[4] L-Verse: Bidirectional Generation Between Image and Text(图像和文本之间的双向生成)(**Oral**)
\n[paper](https://arxiv.org/abs/2111.11133)

\n\n[3] HairCLIP: Design Your Hair by Text and Reference Image(通过文本和参考图像设计你的头发)
\nkeywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks
\n[paper](https://arxiv.org/abs/2112.05142) | [code](https://github.com/wty-ustc/HairCLIP)

\n\n[2] CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields(文本和图像驱动的神经辐射场操作)
\nkeywords: NeRF, Image Generation and Manipulation, Language-Image Pre-Training (CLIP)
\n[paper](https://arxiv.org/abs/2112.05139) | [code](https://cassiepython.github.io/clipnerf/)

\n\n[1] Vision-Language Pre-Training with Triple Contrastive Learning(三重对比学习的视觉语言预训练)
\nkeywords: Vision-language representation learning, Contrastive Learning\n[paper](https://arxiv.org/abs/2202.10401) | [code](https://github.com/uta-smile/TCL)

\n\n\n
\n \n\n## 视觉预测(Vision-based Prediction)\n\n[12] Multi-Person Extreme Motion Prediction(多人极限运动预测)
\n[paper](https://arxiv.org/abs/2105.08825) | [code and dataset](https://team.inria.fr/robotlearn/multi-person-extreme-motion-prediction/)

\n\n[11] Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos(以自我为中心的视频的联合手部运动和交互热点预测)
\n[paper](https://arxiv.org/abs/2204.01696) | [project](https://stevenlsw.github.io/hoi-forecast)

\n\n[10] Vehicle trajectory prediction works, but not everywhere(车辆轨迹预测有效，但并非无处不在)
\n[paper](https://arxiv.org/abs/2112.03909) | [code](https://s-attack.github.io/)

\n\n[9] Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion(基于运动不确定性扩散的随机轨迹预测)
\n[paper](https://arxiv.org/abs/2203.13777) | [code](https://github.com/gutianpei/MID)

\n\n[8] Non-Probability Sampling Network for Stochastic Human Trajectory Prediction(用于随机人体轨迹预测的非概率采样网络)
\n[paper](https://arxiv.org/abs/2203.13471) | [code](https://github.com/inhwanbae/NPSN)

\n\n[7] Remember Intentions: Retrospective-Memory-based Trajectory Prediction(记住意图：基于回顾性记忆的轨迹预测)
\n[paper](https://arxiv.org/abs/2203.11474) | [code](https://github.com/MediaBrain-SJTU/MemoNet)

\n\n[6] GaTector: A Unified Framework for Gaze Object Prediction(凝视对象预测的统一框架)
\n[paper](https://arxiv.org/abs/2112.03549)

\n\n[5] On Adversarial Robustness of Trajectory Prediction for Autonomous Vehicles(自动驾驶汽车轨迹预测的对抗鲁棒性)
\n[paper](https://arxiv.org/abs/2201.05057) | [code](https://github.com/zqzqz/AdvTrajectoryPrediction)

\n\n[4] Adaptive Trajectory Prediction via Transferable GNN(基于可迁移 GNN 的自适应轨迹预测)
\n[paper](https://arxiv.org/abs/2203.05046)

\n\n[3] Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective(迈向稳健和自适应运动预测：因果表示视角)
\n[paper](https://arxiv.org/abs/2111.14820) | [code](https://github.com/vita-epfl/causalmotion)

\n\n[2] How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting(多少个观察就足够了？轨迹预测的知识蒸馏)
\nkeywords: Knowledge Distillation, trajectory forecasting
\n[paper](https://arxiv.org/abs/2203.04781)

\n\n[1] Motron: Multimodal Probabilistic Human Motion Forecasting(多模式概率人体运动预测)
\n[paper](https://arxiv.org/abs/2203.04132)

\n\n\n\n
\n \n\n## 数据集(Dataset)\n\n[18] Pyramid Grafting Network for One-Stage High Resolution Saliency Detection(用于单阶段高分辨率显着性检测的金字塔嫁接网络)
\n[paper](https://arxiv.org/abs/2204.05041) | [dataset](https://github.com/iCVTEAM/PGNet)

\n\n[17] TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting(使用 Transformer 编码多尺度时间相关性以进行重复动作计数)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.01018) | [dataset](https://svip-lab.github.io/dataset/RepCount_dataset.html) | [code](https://github.com/SvipRepetitionCounting/TransRAC)

\n\n[16] Multi-Person Extreme Motion Prediction(多人极限运动预测)(human interaction dataset)
\n[paper](https://arxiv.org/abs/2105.08825) | [code and dataset](https://team.inria.fr/robotlearn/multi-person-extreme-motion-prediction/)

\n\n[15] ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer(用于 Sim2Real 传输的多感官对象数据集)
\n[paper](https://arxiv.org/abs/2204.02389) | [project](https://ai.stanford.edu/~rhgao/objectfolder2.0/) | [dataset](https://github.com/rhgao/ObjectFolder)

\n\n[14] Rethinking Visual Geo-localization for Large-Scale Applications(重新思考大规模应用程序的视觉地理定位)
\n[paper](https://arxiv.org/abs/2204.02287) | [Dataset, code and trained models](https://github.com/gmberton/CosPlace)

\n\n[13] Deep Image-based Illumination Harmonization(基于深度图像的照明协调)
\n[paper](https://arxiv.org/abs/2108.00150) | [dataset](https://github.com/zhongyunbao/Dataset)

\n\n[12] OakInk: A Large-scale Knowledge Repository for Understanding Hand-Object Interaction(理解手物交互的大规模知识库)
\n[paper](https://arxiv.org/abs/2203.15709) | [datasets&code](https://github.com/lixiny/OakInk)

\n\n[11] Instance-wise Occlusion and Depth Orders in Natural Scenes(自然场景中的实例遮挡和深度顺序)
\n[paper](https://arxiv.org/abs/2111.14562) | [code](https://github.com/POSTECH-CVLab/InstaOrder)

\n\n[10] Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities(用于理解程序活动的大规模多视图视频数据集)
\n[paper](https://arxiv.org/abs/2203.14712) | [project](https://assembly101.github.io/)

\n\n[9] Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task(用于自动驾驶和单目 3D 目标检测任务的路边感知数据集)
\n[paper](https://arxiv.org/abs/2203.13608) | [dataset](https://thudair.baai.ac.cn/rope)

\n\n[8] DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation(用于语义变化分割的每日多光谱卫星数据集)
\n[paper](https://arxiv.org/abs/2203.12560) | [data](https://mediatum.ub.tum.de/1650201) | [website](https://codalab.lisn.upsaclay.fr/competitions/2882)

\n\n[7] Egocentric Prediction of Action Target in 3D(以自我为中心的 3D 行动目标预测)(**robotics**)
\n[paper](https://arxiv.org/abs/2203.13116) | [project](https://ai4ce.github.io/EgoPAT3D/)

\n\n[6] M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining(电子商务多模态预训练的自协调对比学习)(multi-modal pretraining dataset)
\n[paper](https://arxiv.org/abs/2109.04275)

\n\n[5] FERV39k: A Large-Scale Multi-Scene Dataset for Facial Expression Recognition in Videos(用于视频中面部表情识别的大规模多场景数据集)
\n[paper](https://arxiv.org/abs/2203.09463)

\n\n[4] Ego4D: Around the World in 3,000 Hours of Egocentric Video(3000 小时以自我为中心的视频环游世界)
\n[paper](https://arxiv.org/abs/2110.07058) | [project](https://ego4d-data.org/)

\n\n[3] GrainSpace: A Large-scale Dataset for Fine-grained and Domain-adaptive Recognition of Cereal Grains(用于细粒度和域自适应识别谷物的大规模数据集)
\n[paper](https://arxiv.org/abs/2203.05306) | [dataset](https://github.com/hellodfan/GrainSpace)

\n\n[2] Kubric: A scalable dataset generator(Kubric：可扩展的数据集生成器)
\n[paper](https://arxiv.org/abs/2203.03570) | [code](https://github.com/google-research/kubric)

\n\n[1] A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection(用于分段级视频复制检测的大规模综合数据集和复制重叠感知评估协议)
\nVCSL (Video Copy Segment Localization) dataset
\n[paper](https://arxiv.org/abs/2203.02654) | [dataset, metric and benchmark codes](https://github.com/alipay/VCSL)\n\n\n\n
\n\n \n\n## 主动学习(Active Learning)\n\n[1] Active Learning by Feature Mixing(通过特征混合进行主动学习)
\n[paper](https://arxiv.org/abs/2203.07034) | [code](https://github.com/Haoqing-Wang/InfoCL)

\n\n\n\n
\n\n \n\n## 小样本学习/零样本学习(Few-shot Learning/Zero-shot Learning)\n\n[4] Integrative Few-Shot Learning for Classification and Segmentation(用于分类和分割的集成小样本学习)
\n[paper](https://arxiv.org/abs/2203.15712)

\n\n[3] Ranking Distance Calibration for Cross-Domain Few-Shot Learning(跨域小样本学习的排名距离校准)
\n[paper](https://arxiv.org/abs/2112.00260)

\n\n[2] Learning to Affiliate: Mutual Centralized Learning for Few-shot Classification(小样本分类的相互集中学习)
\n[paper](https://arxiv.org/abs/2106.05517)

\n\n[1] MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning(用于零样本学习的相互语义蒸馏网络)
\nkeywords: Zero-Shot Learning, Knowledge Distillation
\n[paper](https://arxiv.org/abs/2203.03137) | [code](https://github.com/shiming-chen/MSDN)

\n\n\n\n
\n\n \n\n## 持续学习(Continual Learning/Life-long Learning)\n\n[5] GCR: Gradient Coreset Based Replay Buffer Selection For Continual Learning(用于持续学习的基于梯度核心集的重放缓冲区选择)
\n[paper](https://arxiv.org/abs/2111.11210)

\n\n[4] Probing Representation Forgetting in Supervised and Unsupervised Continual Learning(探索有监督和无监督持续学习中的表征遗忘)
\n[paper](https://arxiv.org/abs/2203.13381)

\n\n[3] Meta-attention for ViT-backed Continual Learning(ViT 支持的持续学习的元注意力)
\n[paper](https://arxiv.org/abs/2203.11684) | [code](https://github.com/zju-vipa/MEAT-TIL)

\n\n[2] Learning to Prompt for Continual Learning(学习提示持续学习)
\n[paper](https://arxiv.org/abs/2112.08654) | [code](https://github.com/google-research/l2p)

\n\n[1] On Generalizing Beyond Domains in Cross-Domain Continual Learning(关于跨域持续学习中的域外泛化)
\n[paper](https://arxiv.org/abs/2203.03970)

\n\n
\n\n \n\n## 场景图(Scene Graph)\n\n[1] Continuous Scene Representations for Embodied AI(具身 AI 的连续场景表示)
\n[paper](https://arxiv.org/abs/2203.17251) | [project](https://prior.allenai.org/projects/csr) | [code](https://github.com/allenai/CSR) | [video](https://youtu.be/XdNIcEz_8Tw)

\n\n \n\n### 场景图生成(Scene Graph Generation)\n\n\n[2] Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation(用于无偏场景图生成的堆叠混合注意力和组协作学习)
\n[paper](https://arxiv.org/abs/2203.09811) | [code](https://github.com/dongxingning/SHA-GCL-for-SGG)

\n\n[1] Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs(将视频场景图重新格式化为时间二分图)
\nkeywords: Video Scene Graph Generation, Transformer, Video Grounding
\n[paper](https://arxiv.org/abs/2112.04222) | [code](https://github.com/Dawn-LX/VidVRD-tracklets)

\n\n
\n\n \n\n### 场景图预测(Scene Graph Prediction)\n\n
\n\n \n\n### 场景图理解(Scene Graph Understanding)\n\n
\n\n \n\n## 视觉定位/位姿估计(Visual Localization/Pose Estimation)\n\n[16] ES6D: A Computation Efficient and Symmetry-Aware 6D Pose Regression Framework(一种计算效率高且具有对称性的 6D 姿势回归框架)
\n[paper](https://arxiv.org/abs/2204.01080) | [code](https://github.com/GANWANSHUI/ES6D.git)

\n\n[15] Templates for 3D Object Pose Estimation Revisited: Generalization to New Objects and Robustness to Occlusions(重新审视 3D 对象姿态估计的模板：对新对象的泛化和对遮挡的鲁棒性)
\n[paper](https://arxiv.org/abs/2203.17234) | [code](https://github.com/nv-nguyen/template-pose)

\n\n[14] OSOP: A Multi-Stage One Shot Object Pose Estimation Framework(多阶段 One Shot 对象姿态估计框架)
\n[paper](https://arxiv.org/abs/2203.15533)

\n\n[13] Putting People in their Place: Monocular Regression of 3D People in Depth(3D 人物深度的单目回归)
\n[paper](https://arxiv.org/abs/2112.08274) | [code](https://github.com/Arthur151/ROMP) | [dataset](https://github.com/Arthur151/Relative_Human)

\n\n[12] FS6D: Few-Shot 6D Pose Estimation of Novel Objects(新物体的小样本 6D 姿态估计)
\n[paper](https://arxiv.org/abs/2203.14628) | [project](https://fs6d.github.io/)

\n\n[11] Uni6D: A Unified CNN Framework without Projection Breakdown for 6D Pose Estimation(用于 6D 姿势估计的无投影分解的统一 CNN 框架)
\n[paper](https://arxiv.org/abs/2203.14531)

\n\n[10] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation(用于单目物体姿态估计的广义端到端概率透视-n-点)
\n[paper](https://arxiv.org/abs/2203.13254)

\n\n[9] RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization(具有鲁棒对应场估计和姿态优化的递归 6-DoF 对象姿态细化)
\n[paper](https://arxiv.org/abs/2203.12870) | [code](https://github.com/DecaYale/RNNPose)

\n\n[8] DiffPoseNet: Direct Differentiable Camera Pose Estimation(直接可微分相机位姿估计)
\n[paper](https://arxiv.org/abs/2203.11174)

\n\n[7] ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation(用于 6DoF 对象姿态估计的粗到细表面编码)
\n[paper](https://arxiv.org/abs/2203.09418)

\n\n[6] Object Localization under Single Coarse Point Supervision(单粗点监督下的目标定位)
\n[paper](https://arxiv.org/abs/2203.09338) | [code](https://github.com/ucas-vg/PointTinyBenchmark/)

\n\n[5] CrossLoc: Scalable Aerial Localization Assisted by Multimodal Synthetic Data(多模式合成数据辅助的可扩展空中定位)
\n[paper](https://arxiv.org/abs/2112.09081) | [code](https://github.com/TOPO-EPFL/CrossLoc)

\n\n[4] GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting(通过几何引导的逐点投票进行类别级对象位姿估计)
\n[paper](https://arxiv.org/abs/2203.07918) | [code](https://github.com/lolrudy/GPV_Pose)

\n\n[3] CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild(CPPF：在野外实现稳健的类别级 9D 位姿估计)
\n[paper](https://arxiv.org/abs/2203.03089) | [code](https://github.com/qq456cvb/CPPF)

\n\n[2] OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation(用于基于深度的 6D 对象位姿估计的对象视点编码)
\n[paper](https://arxiv.org/abs/2203.01072) | [code](https://github.com/dingdingcai/OVE6D-pose)

\n\n[1] Spatial Commonsense Graph for Object Localisation in Partial Scenes(局部场景中对象定位的空间常识图)
\n[paper](https://arxiv.org/abs/2203.05380) | [code](https://github.com/FGiuliari/SpatialCommonsenseGraph-Dataset) | [project](http://fgiuliari.github.io/projects/SpatialCommonsenseGraph/)

\n\n
\n\n \n\n## 视觉推理/视觉问答(Visual Reasoning/VQA)\n\n[5] SimVQA: Exploring Simulated Environments for Visual Question Answering(探索视觉问答的模拟环境)
\n[paper](https://arxiv.org/abs/2203.17219) | [project](https://simvqa.github.io/)

\n\n[4] Learning to Answer Questions in Dynamic Audio-Visual Scenarios(学习在动态视听场景中回答问题)(audio-visual learning)
\n[paper](https://arxiv.org/abs/2203.14072) | [code](http://gewu-lab.github.io/MUSIC-AVQA/)

\n\n[3] Visual Abductive Reasoning(视觉溯因推理)
\n[paper](https://arxiv.org/abs/2203.14040) | [code](https://github.com/leonnnop/VAR)

\n\n[2] MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering(基于知识的视觉问答的多模态知识提取与积累)
\n[paper](https://arxiv.org/abs/2203.09138) | [code](https://github.com/AndersonStra/MuKEA)

\n\n[1] REX: Reasoning-aware and Grounded Explanation(推理意识和扎根的解释)
\n[paper](https://arxiv.org/abs/2203.06107) | [code](https://github.com/szzexpoi/rex)

\n\n
\n\n \n\n## 图像分类(Image Classification)\n\n[2] CAD: Co-Adapting Discriminative Features for Improved Few-Shot Classification(共同适应判别特征以改进小样本分类)
\n[paper](https://arxiv.org/abs/2203.13465)

\n\n[1] GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction(用于多类别属性预测的基于全局、局部和内在的密集嵌入网络)
\nkeywords: multi-label classification
\n[paper](https://arxiv.org/abs/2203.03079) | [code](https://github.com/kareem-metwaly/glidenet) | [project](http://signal.ee.psu.edu/research/glidenet.html)

\n\n
\n\n \n\n## 迁移学习/domain/自适应(Transfer Learning/Domain Adaptation)\n\n[10] Transferability Estimation using Bhattacharyya Class Separability(使用 Bhattacharyya 类可分离性的可迁移性估计)
\n[paper](https://arxiv.org/abs/2111.12780)

\n\n[9] The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization(通过归一化进行动态无监督域自适应)
\n[paper](https://arxiv.org/abs/2112.00463) | [code](https://github.com/jmiemirza/DUA)

\n\n[8] Continual Test-Time Domain Adaptation(持续测试时域适应)
\n[paper](https://arxiv.org/abs/2203.13591) | [code](https://qin.ee/cotta)

\n\n[7] Compound Domain Generalization via Meta-Knowledge Encoding(基于元知识编码的复合域泛化)
\n[paper](https://arxiv.org/abs/2203.13006)

\n\n[6] Learning Affordance Grounding from Exocentric Images(从离中心图像中学习可供性基础)
\n[paper](https://arxiv.org/abs/2203.09905) | [code](http://github.com/lhc1224/Cross-View-AG)

\n\n[5] Category Contrast for Unsupervised Domain Adaptation in Visual Tasks(视觉任务中无监督域适应的类别对比)
\n[paper](https://arxiv.org/abs/2106.02885)

\n\n[4] Learning Distinctive Margin toward Active Domain Adaptation(向主动领域适应学习独特的边际)
\n[paper](https://arxiv.org/abs/2203.05738) | [code](https://github.com/TencentYoutuResearch/ActiveLearning-SDM)

\n\n[3] How Well Do Sparse Imagenet Models Transfer?(稀疏 Imagenet 模型的迁移效果如何？)
\n[paper](https://arxiv.org/abs/2111.13445)

\n\n[2] A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation(用于手语翻译的简单多模态迁移学习基线)
\n[paper](https://arxiv.org/abs/2203.04287)

\n\n[1] Weakly Supervised Object Localization as Domain Adaption(作为域适应的弱监督对象定位)
\nkeywords: Weakly Supervised Object Localization(WSOL), Multi-instance learning based WSOL, Separated-structure based WSOL, Domain Adaption
\n[paper](https://arxiv.org/abs/2203.01714) | [code](https://github.com/zh460045050/DA-WSOL_CVPR2022)

\n\n
\n\n \n\n## 度量学习(Metric Learning)\n\n[4] Hyperbolic Vision Transformers: Combining Improvements in Metric Learning(双曲线视觉transformer：结合度量学习的改进)
\n[paper](https://arxiv.org/abs/2203.10833) | [code](https://github.com/htdt/hyp_metric)

\n\n[3] Non-isotropy Regularization for Proxy-based Deep Metric Learning(基于代理的深度度量学习的非各向同性正则化)
\n[paper](https://arxiv.org/abs/2203.08547) | [code](https://github.com/ExplainableML/NonIsotropicProxyDML)

\n\n[2] Integrating Language Guidance into Vision-based Deep Metric Learning(将语言指导集成到基于视觉的深度度量学习中)
\n[paper](https://arxiv.org/abs/2203.08543) | [code](https://github.com/ExplainableML/LanguageGuidance_for_DML)

\n\n[1] Enhancing Adversarial Robustness for Deep Metric Learning(增强深度度量学习的对抗鲁棒性)
\nkeywords: Adversarial Attack, Adversarial Defense, Deep Metric Learning
\n[paper](https://arxiv.org/pdf/2203.01439.pdf)

\n\n
\n\n \n\n## 对比学习(Contrastive Learning)\n\n[6] Versatile Multi-Modal Pre-Training for Human-Centric Perception(用于以人为中心的感知的多功能多模态预训练)
\n[paper](https://arxiv.org/abs/2203.13815) | [project](https://hongfz16.github.io/projects/HCMoCo.html) | [code](https://github.com/hongfz16/HCMoCo)

\n\n[5] Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation(用于弱监督对象定位和语义分割的类不可知激活图的对比学习)
\n[paper](https://arxiv.org/abs/2203.13505) | [code](https://github.com/CVI-SZU/CCAM)

\n\n[4] Rethinking Minimal Sufficient Representation in Contrastive Learning(重新思考对比学习中的最小充分表示)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.07004) | [code](https://github.com/Haoqing-Wang/InfoCL)

\n\n[3] Selective-Supervised Contrastive Learning with Noisy Labels(带有噪声标签的选择性监督对比学习)
\n[paper](https://arxiv.org/abs/2203.04181) | [code](https://github.com/ShikunLi/Sel-CL)

\n\n[2] HCSC: Hierarchical Contrastive Selective Coding(分层对比选择性编码)
\nkeywords: Self-supervised Representation Learning, Deep Clustering, Contrastive Learning
\n[paper](https://arxiv.org/abs/2202.00455) | [code](https://github.com/gyfastas/HCSC)

\n\n[1] Crafting Better Contrastive Views for Siamese Representation Learning(为连体表示学习制作更好的对比视图)
\n[paper](https://arxiv.org/pdf/2202.03278.pdf) | [code](https://github.com/xyupeng/ContrastiveCrop)

\n\n
\n\n \n\n## 增量学习(Incremental Learning)\n\n[3] Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning(类增量学习的初始阶段去相关方法)
\n[paper](https://arxiv.org/abs/2112.04731) | [code](https://github.com/Yujun-Shi/CwD)

\n\n[2] Forward Compatible Few-Shot Class-Incremental Learning(前向兼容的小样本类增量学习)
\n[paper](https://arxiv.org/abs/2203.06953) | [code](https://github.com/zhoudw-zdw/CVPR22-Fact)

\n\n[1] Self-Sustaining Representation Expansion for Non-Exemplar Class-Incremental Learning(非示例类增量学习的自我维持表示扩展)
\n[paper](https://arxiv.org/abs/2203.06359)

\n\n
\n\n \n\n## 强化学习(Reinforcement Learning)\n\n[1] Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory(具有编排记忆的演员评论家 GPT 的 3D 舞蹈生成)
\n[paper](https://arxiv.org/abs/2203.13055) | [code](https://github.com/lisiyao21/Bailando/)

\n\n
\n\n \n\n## 元学习(Meta Learning)\n\n[3] A Structured Dictionary Perspective on Implicit Neural Representations(隐式神经表示的结构化字典视角)
\n[paper](https://arxiv.org/abs/2112.01917) | [code](https://github.com/gortizji/inr_dictionaries)

\n\n[2] Multidimensional Belief Quantification for Label-Efficient Meta-Learning(标签高效元学习的多维信念量化)
\n[paper](https://arxiv.org/abs/2203.12768)

\n\n[1] What Matters For Meta-Learning Vision Regression Tasks?(元学习视觉回归任务的重要性是什么？)
\n[paper](https://arxiv.org/abs/2203.04905)

\n\n
\n\n \n\n## 机器人(Robotic)\n\n\n[2] Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation(通过离散化实现视觉机器人操作的高效学习)
\n[paper](https://arxiv.org/abs/2106.12534) | [code](https://github.com/stepjam/ARM) | [project](https://sites.google.com/view/c2f-q-attention)

\n\n[1] IFOR: Iterative Flow Minimization for Robotic Object Rearrangement(IFOR：机器人对象重排的迭代流最小化)
\n[paper](https://arxiv.org/pdf/2202.00732.pdf) | [project](https://imankgoyal.github.io/ifor.html)

\n\n
\n\n \n\n## 半监督学习/弱监督学习/无监督学习/自监督学习(Self-supervised Learning/Semi-supervised Learning)\n\n[8] When Does Contrastive Visual Representation Learning Work?(对比视觉表征学习何时起作用)
\n[paper](https://arxiv.org/abs/2105.05837)

\n\n[7] Leverage Your Local and Global Representations: A New Self-Supervised Learning Strategy(利用局部和全局表征：一种新的自我监督学习策略)
\n[paper](https://arxiv.org/abs/2203.17205)

\n\n[6] Decoupling Makes Weakly Supervised Local Feature Better(解耦使弱监督的局部特征更好)
\n[paper](https://arxiv.org/abs/2201.02861) | [code](https://github.com/The-Learning-And-Vision-Atelier-LAVA/PoSFeat)

\n\n[5] SimMatch: Semi-supervised Learning with Similarity Matching(具有相似性匹配的半监督学习)
\n[paper](https://arxiv.org/abs/2203.06915) | [code](https://github.com/KyleZheng1997/simmatch)

\n\n[4] Robust Equivariant Imaging: a fully unsupervised framework for learning to image from noisy and partial measurements(一个完全无监督的框架，用于从噪声和部分测量中学习图像)
\n[paper](https://arxiv.org/abs/2111.12855) | [code](https://github.com/edongdongchen/REI)

\n\n[3] UniVIP: A Unified Framework for Self-Supervised Visual Pre-training(自监督视觉预训练的统一框架)
\n[paper](https://arxiv.org/abs/2203.06965)

\n\n[2] Class-Aware Contrastive Semi-Supervised Learning(类感知对比半监督学习)
\nkeywords: Semi-Supervised Learning, Self-Supervised Learning, Real-World Unlabeled Data Learning
\n[paper](https://arxiv.org/abs/2203.02261)

\n\n[1] A study on the distribution of social biases in self-supervised learning visual models(自监督学习视觉模型中社会偏见分布的研究)
\n[paper](https://arxiv.org/pdf/2203.01854.pdf)

\n\n
\n\n \n\n## 神经网络可解释性(Neural Network Interpretability)\n\n[2] Do Explanations Explain? Model Knows Best(解释解释吗？模型最清楚)
\n[paper](https://arxiv.org/abs/2203.02269)

\n\n[1] Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks(神经网络中可解释的部分-整体层次结构和概念语义关系)
\n[paper](https://arxiv.org/abs/2203.03282)

\n\n
\n\n \n\n\n## 图像计数(Image Counting)\n\n[3] DR.VIC: Decomposition and Reasoning for Video Individual Counting(视频个体计数的分解与推理)
\n[paper](https://arxiv.org/abs/2203.12335) | [code](https://github.com/taohan10200/DRNet)

\n\n[2] Represent, Compare, and Learn: A Similarity-Aware Framework for Class-Agnostic Counting(表示、比较和学习：用于类不可知计数的相似性感知框架)
\n[paper](https://arxiv.org/abs/2203.08354) | [code](https://github.com/flyinglynx/Bilinear-Matching-Network)

\n\n[1] Boosting Crowd Counting via Multifaceted Attention(通过多方面注意提高人群计数)
\n[paper](https://arxiv.org/pdf/2203.02636.pdf) | [code](https://github.com/LoraLinH/Boosting-Crowd-Counting-via-Multifaceted-Attention)

\n\n
\n\n \n\n\n## 联邦学习(Federated Learning)\n\n[5] FedCor: Correlation-Based Active Client Selection Strategy for Heterogeneous Federated Learning(用于异构联邦学习的基于相关性的主动客户端选择策略)
\n[paper](https://arxiv.org/abs/2103.13822)

\n\n[4] FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction(通过局部漂移解耦和校正与非 IID 数据进行联邦学习)
\n[paper](https://arxiv.org/abs/2203.11751) | [code](https://github.com/gaoliang13/FedDC)

\n\n[3] Federated Class-Incremental Learning(联邦类增量学习)
\n[paper](https://arxiv.org/abs/2203.11473) | [code](https://github.com/conditionWang/FCIL)

\n\n[2] Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning(通过非 IID 联邦学习的无数据知识蒸馏微调全局模型)
\n[paper](https://arxiv.org/abs/2203.09249)

\n\n[1] Differentially Private Federated Learning with Local Regularization and Sparsification(局部正则化和稀疏化的差分私有联邦学习)
\n[paper](https://arxiv.org/abs/2203.03106)
\n\n

\n\n \n\n## 其他(Others)\n\nLess is More: Generating Grounded Navigation Instructions from Landmarks(从地标生成接地导航指令)(视觉导航)
\n[paper](https://arxiv.org/abs/2111.12872)

\n\nFast, Accurate and Memory-Efficient Partial Permutation Synchronization(快速、准确和内存高效的部分置换同步)
\n[paper](https://arxiv.org/abs/2203.16505)

\n\nInteractive Disentanglement: Learning Concepts by Interacting with their Prototype Representations(通过与原型表示交互来学习概念)
\n[paper](https://arxiv.org/abs/2112.02290) | [code](https://github.com/ml-research/XIConceptLearning)

\n\nClean Implicit 3D Structure from Noisy 2D STEM Images(从含噪的 2D STEM 图像中获得干净的隐式 3D 结构)
\n[paper](https://arxiv.org/abs/2203.15434)

\n\nZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds(二维点云的通用旋转等变架构)
\n[paper](https://arxiv.org/abs/2111.15341)

\n\nMDAN: Multi-level Dependent Attention Network for Visual Emotion Analysis(用于视觉情感分析的多级依赖注意网络)
\n[paper](https://arxiv.org/abs/2203.13443)

\n\nMoving Window Regression: A Novel Approach to Ordinal Regression(序数回归的一种新方法)
\n[paper](https://arxiv.org/abs/2203.13122) | [code](https://github.com/nhshin-mcl/MWR)

\n\nHierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction(用于有效降维的分层最近邻图嵌入)
\n[paper](https://arxiv.org/abs/2203.12997) | [code](https://github.com/koulakis/h-nne)

\n\nTransVPR: Transformer-based place recognition with multi-level attention aggregation(具有多级注意力聚合的基于 Transformer 的**位置识别**)(**图像匹配**)
\n[paper](https://arxiv.org/abs/2201.02001)

\n\nEv-TTA: Test-Time Adaptation for Event-Based Object Recognition(基于事件的对象识别的测试时间适应)
\n[paper](https://arxiv.org/abs/2203.12247)

\n\nLearning from All Vehicles(向所有车辆学习)(自动驾驶)
\n[paper](https://arxiv.org/abs/2203.11934) | [code](https://github.com/dotchen/LAV) | [demo](https://dotchen.github.io/LAV/)

\n\nMixed Differential Privacy in Computer Vision(计算机视觉中的混合差分隐私)
\n[paper](https://arxiv.org/abs/2203.11481)

\n\nRobust and Accurate Superquadric Recovery: a Probabilistic Approach(稳健且准确的超二次曲线恢复：一种概率方法)
\n[paper](https://arxiv.org/abs/2111.14517) | [code](http://github.com/bmlklwx/EMS-superquadric_fitting.git)

\n\nAirObject: A Temporally Evolving Graph Embedding for Object Identification(用于对象识别的时间演化图嵌入)(**object encoding**)
\n[paper](https://arxiv.org/abs/2111.15150) | [code](https://github.com/Nik-V9/AirObject)

\n\nFastDOG: Fast Discrete Optimization on GPU(GPU 上的快速离散优化)
\n[paper](https://arxiv.org/abs/2111.10270) | [code](https://github.com/LPMP/BDD)

\n\nNeural Collaborative Graph Machines for Table Structure Recognition(用于表结构识别的神经协同图机)
\n[paper](https://arxiv.org/abs/2111.13359)

\n\nContrastive Conditional Neural Processes(对比条件神经过程)
\n[paper](https://arxiv.org/pdf/2203.03978.pdf)

\n\nDeep Rectangling for Image Stitching: A Learning Baseline(图像拼接的深度矩形：学习基线)(**Image Stitching**)
\n[paper](https://arxiv.org/abs/2203.03831) | [code](https://github.com/nie-lang/DeepRectangling)

\n\nOnline Learning of Reusable Abstract Models for Object Goal Navigation(对象目标导航可重用抽象模型的在线学习)
\n[paper](https://arxiv.org/abs/2203.02583)

\n\nPINA: Learning a Personalized Implicit Neural Avatar from a Single RGB-D Video Sequence(PINA：从单个 RGB-D 视频序列中学习个性化的隐式神经化身)
\n[paper](https://arxiv.org/abs/2203.01754) | [video](https://youtu.be/oGpKUuD54Qk) | [project](https://zj-dong.github.io/pina/)

\n\n
\n\n
\n\n \n\n\n# 2. CVPR2022 Oral\n\n[21] What to look at and where: Semantic and Spatial Refined Transformer for detecting human-object interactions(看什么和在哪里看：语义和空间精炼transformer，用于检测人与物体的交互)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.00746)

\n\n[20] Marginal Contrastive Correspondence for Guided Image Generation(引导图像生成的边际对比对应)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.00442)

\n\n[19] TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting(使用 Transformer 编码多尺度时间相关性以进行重复动作计数)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.01018) | [dataset](https://svip-lab.github.io/dataset/RepCount_dataset.html) | [code](https://github.com/SvipRepetitionCounting/TransRAC)

\n\n[18] Learning Part Segmentation through Unsupervised Domain Adaptation from Synthetic Vehicles(通过合成车辆的无监督域适应学习局部分割)(**Oral**)
\n[paper](https://arxiv.org/abs/2103.14098) | [dataset](https://qliu24.github.io/udapart)

\n\n[17] Semantic-Aware Domain Generalized Segmentation(语义感知领域广义分割)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.00822) | [code](https://github.com/leolyj/SAN-SAW)

\n\n[16] Revisiting Skeleton-based Action Recognition(重新审视基于骨架的动作识别)(**Oral**)
\n[paper](https://arxiv.org/abs/2104.13586) | [code](https://github.com/kennymckormick/pyskl)

\n\n[15] MAXIM: Multi-Axis MLP for Image Processing(用于图像处理的多轴 MLP)(**Oral**)
\n[paper](https://arxiv.org/abs/2201.02973) | [code](https://github.com/google-research/maxim)

\n\n[14] Rethinking Minimal Sufficient Representation in Contrastive Learning(重新思考对比学习中的最小充分表示)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.07004) | [code](https://github.com/Haoqing-Wang/InfoCL)

\n\n[13] I M Avatar: Implicit Morphable Head Avatars from Videos(视频中的隐式可变形头部头像)(**Oral**)
\n[paper](https://arxiv.org/abs/2112.07471) | [project](https://ait.ethz.ch/projects/2022/IMavatar/)

\n\n[12] Parameter-free Online Test-time Adaptation(无参数在线测试时间自适应)(**Oral**)
\n[paper](https://arxiv.org/abs/2201.05718) | [code](https://github.com/fiveai/LAME)

\n\n[11] Correlation Verification for Image Retrieval(图像检索的相关性验证)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.01458) | [code](https://github.com/sungonce/CVNet)

\n\n[10] Rethinking Semantic Segmentation: A Prototype View(重新思考语义分割：原型视图)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.15102) | [code](https://github.com/tfzhou/ProtoSeg)

\n\n[9] SNUG: Self-Supervised Neural Dynamic Garments(自我监督的神经动态服装)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.02219) | [project](http://mslab.es/projects/SNUG/)

\n\n[8] SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video(从单目视频自我重建你的数字化身)(**Oral**)
\n[paper](https://arxiv.org/abs/2201.12792) | [code](https://github.com/jby1993/SelfReconCode)

\n\n[7] Dual-AI: Dual-path Action Interaction Learning for Group Activity Recognition(用于群体动作识别的双路径动作交互学习)(**Oral**)
\n[paper](https://arxiv.org/abs/2204.02148) | [project](https://arxiv.org/pdf/2204.02148)

\n\n[6] 3D Common Corruptions and Data Augmentation(3D 常见损坏和数据增强)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.01441) | [project](https://3dcommoncorruptions.epfl.ch/)

\n\n[5] GAN-Supervised Dense Visual Alignment(GAN监督的密集视觉对齐)(**Oral**)
\n[paper](https://arxiv.org/abs/2112.05143) | [code](https://www.github.com/wpeebles/gangealing) | [project](https://www.wpeebles.com/gangealing)

\n\n[4] It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher(一切尽在老师身上：零样本量化更贴近老师)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.17008)

\n\n[3] AdaMixer: A Fast-Converging Query-Based Object Detector(一种快速收敛的基于查询的对象检测器)(**Oral**)
\n[paper](https://arxiv.org/abs/2203.16507) | [code](https://github.com/MCG-NJU/AdaMixer)

\n\n[2] Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry(通过融合单视图深度概率与多视图几何进行多视图深度估计)(**Oral**)
\n[paper](https://arxiv.org/abs/2112.08177) | [code](https://github.com/baegwangbin/MaGNet)

\n\n[1] L-Verse: Bidirectional Generation Between Image and Text(图像和文本之间的双向生成) **(视觉语言表征学习)**
\n[paper](https://arxiv.org/abs/2111.11133)

\n\n
\n\n
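\n\n \n\n# 3. CVPR2022 Paper Interpretations\n\n【22】[The MLP is the key reason unsupervised learning transfers better than supervised learning](https://bbs.cvmart.net/articles/6191)

\n\n【21】[Accurate and efficient multi-person 3D pose estimation: Meitu and Beihang University jointly propose a distribution-aware single-stage model](https://mp.weixin.qq.com/s/UAtqZezVddSetn6Y_YFq9Q)

\n\n【20】[Applying domain adaptation: Peking University and ByteDance propose a new weakly supervised object localization framework](https://bbs.cvmart.net/articles/6197)

\n\n【19】[From a single image plus a camera trajectory, AI can imagine the surrounding environment](https://mp.weixin.qq.com/s/wPiZ5N1bVxFgayej6LbJwQ)

\n\n【18】[Point-BERT: pre-training point cloud self-attention models with masked modeling](https://bbs.cvmart.net/articles/6200)

\n\n【17】[Swin Transformer reaches v2.0 with 3 billion parameters: should we embrace large vision models?](https://bbs.cvmart.net/articles/6202)

\n\n【16】[Adobe stitches GANs together to generate a 1024-resolution full-body portrait from scratch](https://bbs.cvmart.net/articles/6205)

\n\n【15】[USTC and collaborators propose Neural Points, a continuous implicit representation for point clouds, with impressive upsampling results](https://bbs.cvmart.net/articles/6201)

\n\n【14】[Max Planck Institute open-sources ICON: significantly improving pose handling in single-image 3D digital human reconstruction](https://bbs.cvmart.net/articles/6186)

\n\n【13】[Images are de Broglie waves too! Huawei Noah's Ark Lab and Peking University propose a quantum-inspired MLP that outperforms Swin Transformer](https://bbs.cvmart.net/articles/6182)

\n\n【12】[群核前沿院 and collaborators propose the first data-driven face detection algorithm](https://bbs.cvmart.net/articles/6184)

\n\n【11】[MPViT: Multi-Path Vision Transformer for dense prediction](https://bbs.cvmart.net/articles/6183)

\n\n【10】[ST++: a better self-training paradigm for semi-supervised semantic segmentation](https://bbs.cvmart.net/articles/6170)

\n\n【9】[New SOTA for self-supervised CNN pre-training! Shanghai Jiao Tong University and collaborators propose HCSC, a new framework for self-supervised learning of hierarchically structured image representations](https://mp.weixin.qq.com/s/tVE0Zo0xjKaM4UJ4ouOjhg)

\n\n【8】[Restormer: new records on multiple low-level task metrics](https://mp.weixin.qq.com/s/tFIZF7sLzJ29jph0_EYyvg)

\n\n【7】[Ever-changing hairstyles! USTC and collaborators propose HairCLIP, a hair editing method based on text and reference images](https://mp.weixin.qq.com/s/v9rExQBXCd3qEbmzb5XKjQ)

\n\n\n【6】[Why can a 31x31 convolution kernel take about as long as a 9x9 convolution?](https://zhuanlan.zhihu.com/p/479182218)\n[RepLKNet: large-kernel convolution plus structural re-parameterization makes CNNs great again](https://zhuanlan.zhihu.com/p/480935774)

\n\n\n【5】[U2PL: semi-supervised semantic segmentation using unreliable pseudo-labels](https://bbs.cvmart.net/articles/6163)

\n\n\n【4】[Focal and Global Knowledge Distillation (FGD) for object detection](https://bbs.cvmart.net/articles/6169)

\n\n\n【3】[Plug and play! ContrastiveCrop, which boosts self-supervised learning, is now open source](https://bbs.cvmart.net/articles/6157)

\n\n\n【2】[A detailed walkthrough, from principles to code, of FAIR's impressive ConvNeXt, a new pure-convolution model](https://bbs.cvmart.net/articles/6113)\n[A ConvNet “renaissance”: ConvNets come back and beat Transformers as FAIR redesigns a pure-convolution architecture](https://bbs.cvmart.net/articles/6008)

\n\n\n【1】[Nankai's Ming-Ming Cheng team and Tianjin University propose LD: localization distillation for object detection](https://zhuanlan.zhihu.com/p/474955539)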

\n\n\n\n
\n\n \n\n# 4. CVPR2022 Paper Sharing\n\n
\n\n
\n\n \n\n# 5. To do list\n\n* CVPR2022 Workshop\n" }, { "path": "CVPR2023.md", "content": "* Recommended reading:
\n * [ICCV2021/2019/2017 papers, code, interpretations, and livestream collection](https://github.com/extreme-assistant/ICCV2021-Paper-Code-Interpretation)\n * [Collection of 2020-2021 computer vision survey papers](https://github.com/extreme-assistant/survey-computer-vision)\n * [Collection of outstanding computer vision teams in China and abroad](https://github.com/extreme-assistant/Awesome-CV-Team)\n\n------\n\n# CVPR2023 Latest Information and Paper Downloads (Papers/Codes/Project/PaperReading/Demos/Livestreams/Paper-Sharing Sessions, etc.)\n\nOfficial website: https://cvpr.thecvf.com/Conferences/2023
\nAcceptance results announced: February 28, 2023
\n\nRelated question: [What do you think of the CVPR 2023 paper acceptance results?](https://www.zhihu.com/question/585474435)
\nRelated coverage: [CVPR 2023 acceptance results are out! 2,360 papers accepted, up 12%](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247637344&idx=1&sn=62f70870c3f76e6b5a401373986a6ee5&chksm=ec123499db65bd8ffc8987ec1f61b1389e666ed32772149e7c7e76923e2f427ef57e55b57d72&token=1890936456&lang=zh_CN#rd)\n\n\n>update: (each update includes a bundled download link)
\n>2023/2/28 [13 papers added](https://www.cvmart.net/community/detail/7212)
\n>2023/3/02 [54 papers added](https://www.cvmart.net/community/detail/7388)
\n>2023/3/09 [35 papers added](https://www.cvmart.net/community/detail/7403)
\n>2023/3/15 [29 papers added](https://www.cvmart.net/community/detail/7419)
\n>2023/3/16 [8 papers added](https://www.cvmart.net/community/detail/7421)
\n>2023/3/17 [19 papers added](https://www.cvmart.net/community/detail/7428)
\n>2023/3/20 [37 papers added](https://www.cvmart.net/community/detail/7435)
\n>2023/3/22 [61 papers added](https://www.cvmart.net/community/detail/7444)
\n>2023/3/23 [55 papers added](https://www.cvmart.net/community/detail/7449)
\n>2023/3/24 [70 papers added](https://www.cvmart.net/community/detail/7454)
\n>2023/3/25 [99 papers added](https://www.cvmart.net/community/detail/7464)
\n>2023/3/26 [23 papers added](https://www.cvmart.net/community/detail/7465)
\n>2023/3/29 [101 papers added](https://www.cvmart.net/community/detail/7470)
\n>2023/3/31 [89 papers added](https://www.cvmart.net/community/detail/7480)
\n>2023/4/11 [127 papers added](https://www.cvmart.net/community/detail/7509)
\n>2023/4/12 [48 papers added](https://www.cvmart.net/community/detail/7514)
\n>2023/4/13 [51 papers added](https://www.cvmart.net/community/detail/7520)
\n\n

\n\n# Contents\n\n[1. CVPR2023 accepted papers and code by topic (updating)](#1)
\n[2. CVPR2023 spotlight (updating)](#2)
\n[3. CVPR2023 paper interpretations (updating)](#3)
\n[4. CVPR2023 Extreme Mart (极市) paper sharing](#4)
\n[5. To do list](#5)
\n\n
\n\n \n\n# 1. CVPR2023 Accepted Papers and Code by Topic (continuously updated)\n\n\n## Category index:\n\n### [1. 检测(Detection)](#detection)\n\n* [2D目标检测(2D Object Detection)](#IOD)\n* [视频目标检测(Video Object Detection)](#VOD)\n* [3D目标检测(3D Object Detection)](#3DOD)\n* [人物交互检测(HOI Detection)](#HOI)\n* [伪装目标检测(Camouflaged Object Detection)](#COD)\n* [旋转目标检测(Rotation Object Detection)](#ROD)\n* [显著性目标检测(Saliency Object Detection)](#SOD)\n* [关键点检测(Keypoint Detection)](#KeypointDetection)\n* [车道线检测(Lane Detection)](#LaneDetection)\n* [边缘检测(Edge Detection)](#EdgeDetection)\n* [消失点检测(Vanishing Point Detection)](#VPD)\n* [异常检测(Anomaly Detection)](#AnomalyDetection)\n\n### [2. 分割(Segmentation)](#Segmentation)\n\n* [图像分割(Image Segmentation)](#ImageSegmentation)\n* [全景分割(Panoptic Segmentation)](#PanopticSegmentation)\n* [语义分割(Semantic Segmentation)](#SemanticSegmentation)\n* [实例分割(Instance Segmentation)](#InstanceSegmentation)\n* [超像素(Superpixel)](#Superpixel)\n* [视频目标分割(Video Object Segmentation)](#VOS)\n* [抠图(Matting)](#Matting)\n* [密集预测(Dense Prediction)](#DensePrediction)\n\n### [3. 图像处理(Image Processing)](#ImageProcessing)\n\n* [超分辨率(Super Resolution)](#SuperResolution)\n* [图像复原/图像增强/图像重建(Image Restoration/Image Reconstruction)](#ImageRestoration)\n* [图像去阴影/去反射(Image Shadow Removal/Image Reflection Removal)](#ISR)\n* [图像去噪/去模糊/去雨去雾(Image Denoising)](#ImageDenoising)\n* [图像编辑/图像修复(Image Edit/Image Inpainting)](#ImageEdit)\n* [图像翻译(Image Translation)](#ImageTranslation)\n* [图像质量评估(Image Quality Assessment)](#IQA)\n* [风格迁移(Style Transfer)](#StyleTransfer)\n* [图像配准(Image Registration)](#ImageRegistration)\n\n### [4. 视频处理(Video Processing)](#VideoProcessing)\n\n* [视频编辑(Video Editing)](#VideoEditing)\n* [视频生成/视频合成(Video Generation/Video Synthesis)](#VideoGeneration)\n* [视频超分(Video Super-Resolution)](#VideoSR)\n\n### [5. 
估计(Estimation)](#Estimation)\n\n* [光流/运动估计(Flow/Motion Estimation)](#Flow/Pose/MotionEstimation)\n* [深度估计(Depth Estimation)](#DepthEstimation)\n* [人体解析/人体姿态估计(Human Parsing/Human Pose Estimation)](#HumanPoseEstimation)\n* [手势估计(Gesture Estimation)](#GestureEstimation)\n\n### [6. 图像&视频检索/视频理解(Image&Video Retrieval/Video Understanding)](#ImageRetrieval)\n\n* [行为识别/动作识别/检测/分割(Action/Activity Recognition)](#ActionRecognition)\n* [行人重识别/检测(Re-Identification/Detection)](#Re-Identification)\n* [图像/视频字幕(Image/Video Caption)](#VideoCaption)\n\n### [7. 人脸(Face)](#Face)\n\n* [人脸识别/检测(Facial Recognition/Detection)](#FacialRecognition)\n* [人脸生成/合成/重建/编辑(Face Generation/Face Synthesis/Face Reconstruction/Face Editing)](#FaceSynthesis)\n* [人脸伪造/反欺骗(Face Forgery/Face Anti-Spoofing)](#FaceAnti-Spoofing)\n\n### [8. 三维视觉(3D Vision)](#3DVision)\n\n* [点云(Point Cloud)](#3DPC)\n* [三维重建(3D Reconstruction)](#3DReconstruction)\n* [场景重建/视图合成/新视角合成(Novel View Synthesis)](#NeRF)\n\n### [9. 目标跟踪(Object Tracking)](#ObjectTracking)\n\n### [10. 医学影像(Medical Imaging)](#MedicalImaging)\n\n### [11. 文本检测/识别/理解(Text Detection/Recognition/Understanding)](#TDR)\n\n### [12. 遥感图像(Remote Sensing Image)](#RSI)\n\n### [13. GAN/生成式/对抗式(GAN/Generative/Adversarial)](#GAN)\n\n### [14. 图像生成/图像合成(Image Generation/Image Synthesis)](#IGIS)\n\n### [15. 场景图(Scene Graph)](#SG)\n\n* [场景图生成(Scene Graph Generation)](#SGG)\n* [场景图预测(Scene Graph Prediction)](#SGP)\n* [场景图理解(Scene Graph Understanding)](#SGU)\n\n### [16. 视觉定位/位姿估计(Visual Localization/Pose Estimation)](#VisualLocalization)\n\n### [17. 视觉推理/视觉问答(Visual Reasoning/VQA)](#VisualReasoning)\n\n### [18. 视觉预测(Vision-based Prediction)](#Vision-basedPrediction)\n\n### [19. 神经网络结构设计(Neural Network Structure Design)](#NNS)\n\n* [CNN](#CNN)\n* [Transformer](#Transformer)\n* [图神经网络(GNN)](#GNN)\n* [神经网络架构搜索(NAS)](#NAS)\n* [MLP](#MLP)\n\n### [20. 神经网络可解释性(Neural Network Interpretability)](#interpretability)\n\n### [21. 数据集(Dataset)](#Dataset)\n\n### [22. 
数据处理(Data Processing)](#DataProcessing)\n\n* [数据增广(Data Augmentation)](#DataAugmentation)\n* [归一化/正则化(Batch Normalization)](#BatchNormalization)\n* [图像聚类(Image Clustering)](#ImageClustering)\n* [图像压缩(Image Compression)](#ImageCompression)\n\n### [23. 图像特征提取与匹配(Image feature extraction and matching)](#matching)\n\n### [24. 视觉表征学习(Visual Representation Learning)](#VisualRL)\n\n### [25. 模型训练/泛化(Model Training/Generalization)](#ModelTraining)\n\n* [噪声标签(Noisy Label)](#NoisyLabel)\n* [长尾分布(Long-Tailed Distribution)](#Long-Tailed)\n\n### [26. 模型压缩(Model Compression)](#ModelCompression)\n\n* [知识蒸馏(Knowledge Distillation)](#KnowledgeDistillation)\n* [剪枝(Pruning)](#Pruning)\n* [量化(Quantization)](#Quantization)\n\n### [27. 模型评估(Model Evaluation)](#ModelEvaluation)\n\n### [28. 图像分类(Image Classification)](#ImageClassification)\n\n### [29. 图像计数(Image Counting)](#CrowdCounting)\n\n### [30. 机器人(Robotic)](#Robotic)\n\n### [31. 半监督学习/弱监督学习/无监督学习/自监督学习(Self-supervised Learning/Semi-supervised Learning)](#self-supervisedlearning)\n\n### [32. 多模态学习(Multi-Modal Learning)](#MMLearning)\n\n* [视听学习(Audio-visual Learning)](#Audio-VisualLearning)\n* [视觉-语言(Vision-language)](#VLRL)\n\n### [33. 主动学习(Active Learning)](#ActiveLearning)\n\n### [34. 小样本学习/零样本学习(Few-shot/Zero-shot Learning)](#Few-shotLearning)\n\n### [35. 持续学习(Continual Learning/Life-long Learning)](#ContinualLearning)\n\n### [36. 迁移学习/domain/自适应(Transfer Learning/Domain Adaptation)](#domain)\n\n### [37. 度量学习(Metric Learning)](#MetricLearning)\n\n### [38. 对比学习(Contrastive Learning)](#ContrastiveLearning)\n\n### [39. 增量学习(Incremental Learning)](#IncrementalLearning)\n\n### [40. 强化学习(Reinforcement Learning)](#RL)\n\n### [41. 元学习(Meta Learning)](#MetaLearning)\n\n### [42. 联邦学习(Federated Learning)](#federatedlearning)\n\n### [43. 自动驾驶(Autonomous Driving)](#automatic driving)\n\n\n\n\n### [其他(Others)](#100)\n\n\n\n

\n\n \n\n## 检测(Detection)\n\n\n\n
\n\n \n\n### 2D目标检测(2D Object Detection)\n\n[14]DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment
\n[paper](https://arxiv.org/abs/2304.04514)

\n\n[13]Benchmarking the Physical-world Adversarial Robustness of Vehicle Detection
\n[paper](https://arxiv.org/abs/2304.05098)

\n\n[12]Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection with Single Point Supervision
\n[paper](https://arxiv.org/abs/2304.01484) | [code](https://github.com/xinyiying/lesps)

\n\n[11]Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains
\n[paper](https://arxiv.org/abs/2304.02950)

\n\n[10]Continual Detection Transformer for Incremental Object Detection
\n[paper](https://arxiv.org/abs/2304.03110)

\n\n[9]Object Discovery from Motion-Guided Tokens
\n[paper](https://arxiv.org/abs/2303.15555) | [code](https://github.com/zpbao/motok)

\n\n[8]What Can Human Sketches Do for Object Detection?
\n[paper](https://arxiv.org/abs/2303.15149)

\n\n[7]NeRF-RPN: A general framework for object detection in NeRFs
\n[paper](https://arxiv.org/abs/2211.11646)

\n\n[6]Detecting Everything in the Open World: Towards Universal Object Detection
\n[paper](https://arxiv.org/abs/2303.11749)

\n\n[5]Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
\n[paper](https://arxiv.org/abs/2303.05892)

\n\n[4]CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
\n[paper](https://arxiv.org/abs/2303.02489)

\n\n[3]Enhanced Training of Query-Based Object Detection via Selective Query Recollection
\n[paper](https://arxiv.org/abs/2212.07593) | [code](https://github.com/Fangyi-Chen/SQR)

\n\n[2]DETRs with Hybrid Matching
\n[paper](https://arxiv.org/abs/2207.13080) | [code](https://github.com/HDETR)

\n\n[1]YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors(YOLOv7)
\n[paper](https://arxiv.org/abs/2207.02696) | [code](https://github.com/WongKinYiu/yolov7)

\n\n
\n\n
\n\n\n \n\n### 视频目标检测(Video Object Detection)\n\n[4]Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video
\n[paper](https://arxiv.org/abs/2303.16053) | [code](https://github.com/wenzhengzeng/mpeblink)

\n\n[3]Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies
\n[paper](https://arxiv.org/abs/2303.14768) | [code](https://github.com/tencentyouturesearch/highlightdetection-clc)

\n\n[2]3D Video Object Detection with Learnable Object-Centric Global Optimization
\n[paper](https://arxiv.org/abs/2303.15416) | [code](https://github.com/jiaweihe1996/ba-det)

\n\n[1]SCOTCH and SODA: A Transformer Video Shadow Detection Framework
\n[paper](https://arxiv.org/abs/2211.06885)

\n\n
\n\n
\n\n \n\n### 3D目标检测(3D Object Detection)\n\n[28]Curricular Object Manipulation in LiDAR-based Object Detection
\n[paper](https://arxiv.org/abs/2304.04248) | [code](https://github.com/zzy816/com)

\n\n[27]Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection
\n[paper](https://arxiv.org/abs/2304.01464) | [code](https://github.com/azhuantou/hssda)

\n\n[26]Understanding the Robustness of 3D Object Detection with Bird's-Eye-View Representations in Autonomous Driving
\n[paper](https://arxiv.org/abs/2303.17297)

\n\n[25]Learned Two-Plane Perspective Prior based Image Resampling for Efficient Object Detection
\n[paper](https://arxiv.org/abs/2303.14311)

\n\n[24]Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images
\n[paper](https://arxiv.org/abs/2303.14488) | [code](https://github.com/cuogeihong/ceasc)

\n\n[23]Viewpoint Equivariance for Multi-View 3D Object Detection
\n[paper](https://arxiv.org/abs/2303.14548) | [code](https://github.com/tri-ml/vedet)

\n\n[22]Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans
\n[paper](https://arxiv.org/abs/2203.09375)

\n\n[21]itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection
\n[paper](https://arxiv.org/abs/2205.15531)

\n\n[20]Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
\n[paper](https://arxiv.org/abs/2207.10660) | [code](https://github.com/facebookresearch/omni3d)

\n\n[19]FrustumFormer: Adaptive Instance-aware Resampling for Multi-view 3D Detection
\n[paper](https://arxiv.org/abs/2301.04467) | [code](https://github.com/robertwyq/frustum)

\n\n[18]NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations
\n[paper](https://arxiv.org/abs/2303.13483)

\n\n[17]Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving
\n[paper](https://arxiv.org/abs/2303.11040)

\n\n[16]VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
\n[paper](https://arxiv.org/abs/2303.11301) | [code](https://github.com/dvlab-research/VoxelNeXt)

\n\n[15]OcTr: Octree-based Transformer for 3D Object Detection
\n[paper](https://arxiv.org/abs/2303.12621)

\n\n[14]MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer
\n[paper](https://arxiv.org/abs/2303.13018)

\n\n[13]CAPE: Camera View Position Embedding for Multi-View 3D Object Detection
\n[paper](https://arxiv.org/abs/2303.10209) | [code](https://github.com/PaddlePaddle/Paddle3D)

\n\n[12]Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency
\n[paper](https://arxiv.org/abs/2303.08686)

\n\n[11]AeDet: Azimuth-invariant Multi-view 3D Object Detection
\n[paper](https://arxiv.org/abs/2211.12501) | [code](https://github.com/fcjian/AeDet)

\n\n[10]Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection
\n[paper](https://arxiv.org/abs/2303.05886)

\n\n[9]PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
\n[paper](https://arxiv.org/abs/2303.08129) | [code](https://github.com/blvlab/pimae)

\n\n[8]MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences
\n[paper](https://arxiv.org/abs/2303.08316)

\n\n[7]Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View
\n[paper](https://arxiv.org/abs/2303.01686)

\n\n[6]X3KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection
\n[paper](https://arxiv.org/abs/2303.02203)

\n\n[5]Virtual Sparse Convolution for Multimodal 3D Object Detection
\n[paper](https://arxiv.org/abs/2303.02314) | [code](https://github.com/hailanyi/virconv)

\n\n[4]MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection
\n[paper](https://arxiv.org/abs/2209.03102) | [code](https://github.com/sxjyjay/msmdfusion)

\n\n[3]Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection
\n[paper](https://arxiv.org/abs/2303.06880) | [code](https://github.com/PJLab-ADG/3DTrans)

\n\n[2]LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion
\n[paper](https://arxiv.org/abs/2303.03595) | [code](https://github.com/sankin97/LoGoNet)

\n\n[1]ConQueR: Query Contrast Voxel-DETR for 3D Object Detection\n[paper](https://arxiv.org/abs/2212.07289) | [code](https://github.com/poodarchu/ConQueR)

\n\n
\n\n
\n\n \n\n### HOI Detection\n\n[4]Relational Context Learning for Human-Object Interaction Detection
\n[paper](https://arxiv.org/abs/2304.04997)

\n\n[3]Instant-NVR: Instant Neural Volumetric Rendering for Human-object Interactions from Monocular RGBD Stream
\n[paper](https://arxiv.org/abs/2304.03184)

\n\n[2]Category Query Learning for Human-Object Interaction Classification
\n[paper](https://arxiv.org/abs/2303.14005)

\n\n[1]Detecting Human-Object Contact in Images
\n[paper](https://arxiv.org/abs/2303.03373)

\n\n
\n\n
\n\n \n\n### Camouflaged Object Detection\n\n[1]Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers
\n[paper](https://arxiv.org/abs/2303.14816) | [code](https://github.com/zhouhuang23/fspnet)

\n\n
\n\n
\n\n \n\n### Rotated Object Detection\n\n
\n\n \n\n### Salient Object Detection\n\n[2]Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings
\n[paper](https://arxiv.org/abs/2303.11502)

\n\n\n[1]Texture-guided Saliency Distilling for Unsupervised Salient Object Detection
\n[paper](https://arxiv.org/abs/2207.05921) | [code](https://github.com/moothes/A2S-v2)

\n\n
\n\n\n
\n\n \n\n### Keypoint Detection\n\n[2]Few-shot Geometry-Aware Keypoint Localization
\n[paper](https://arxiv.org/abs/2303.17216)

\n\n[1]Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling
\n[paper](https://arxiv.org/abs/2303.15270)

\n\n
\n\n
\n\n \n\n### Lane Detection\n\n[1]BEV-LaneDet: a Simple and Effective 3D Lane Detection Baseline
\n[paper](https://arxiv.org/abs/2210.06006)

\n\n
\n\n
\n\n \n\n### Edge Detection\n\n[2]The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector
\n[paper](https://arxiv.org/abs/2303.11828) | [code](https://github.com/zhoucx117/uaed)

\n\n[1]Iterative Next Boundary Detection for Instance Segmentation of Tree Rings in Microscopy Images of Shrub Cross Sections
\n[paper](https://arxiv.org/abs/2212.03022) | [code](https://github.com/alexander-g/inbd)

\n\n
\n\n
\n\n \n\n### Vanishing Point Detection\n\n
\n\n
\n\n \n\n### Anomaly Detection\n\n[14]Video Event Restoration Based on Keyframes for Video Anomaly Detection
\n[paper](https://arxiv.org/abs/2304.05112)

\n\n[13]Robust Outlier Rejection for 3D Registration with Variational Bayes
\n[paper](https://arxiv.org/abs/2304.01514) | [code](https://github.com/jiang-hb/vbreg)

\n\n[12]OpenMix: Exploring Outlier Samples for Misclassification Detection
\n[paper](https://arxiv.org/abs/2303.17093) | [code](https://github.com/Impression2805/OpenMix)

\n\n[11]WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation
\n[paper](https://arxiv.org/abs/2303.14814)

\n\n[10]SimpleNet: A Simple Network for Image Anomaly Detection and Localization
\n[paper](https://arxiv.org/abs/2303.15140) | [code](https://github.com/donaldrr/simplenet)

\n\n[9]Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features
\n[paper](https://arxiv.org/abs/2303.15167)

\n\n[8]SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection
\n[paper](https://arxiv.org/abs/2111.13495)

\n\n[7]Normalizing Flow based Feature Synthesis for Outlier-Aware Object Detection
\n[paper](https://arxiv.org/abs/2302.07106)

\n\n[6]Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection
\n[paper](https://arxiv.org/abs/2303.13051)

\n\n[5]DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection
\n[paper](https://arxiv.org/abs/2211.11317)

\n\n[4]Diversity-Measurable Anomaly Detection
\n[paper](https://arxiv.org/abs/2303.05047)

\n\n[3]Block Selection Method for Using Feature Norm in Out-of-distribution Detection
\n[paper](https://arxiv.org/abs/2212.02295)

\n\n[2]Lossy Compression for Robust Unsupervised Time-Series Anomaly Detection
\n[paper](https://arxiv.org/abs/2212.02303)

\n\n[1]Multimodal Industrial Anomaly Detection via Hybrid Fusion
\n[paper](https://arxiv.org/abs/2303.00601) | [code](https://github.com/nomewang/M3DM)

\n\n
\n\n\n
\n\n \n\n\n## Segmentation\n\n
\n\n \n\n### Image Segmentation\n\n[7]FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation
\n[paper](https://arxiv.org/abs/2303.17225)

\n\n[6]Zero-shot Referring Image Segmentation with Global-Local Context Features
\n[paper](https://arxiv.org/abs/2303.17811) | [code](https://github.com/seonghoon-yu/zero-shot-ris)

\n\n[5]Parameter Efficient Local Implicit Image Function Network for Face Segmentation
\n[paper](https://arxiv.org/abs/2303.15122)

\n\n[4]EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision
\n[paper](https://arxiv.org/abs/2303.15440)

\n\n[3]Focused and Collaborative Feedback Integration for Interactive Image Segmentation
\n[paper](https://arxiv.org/abs/2303.11880) | [code](https://github.com/veizgyauzgyauz/fcfi)

\n\n[2]MP-Former: Mask-Piloted Transformer for Image Segmentation
\n[paper](https://arxiv.org/abs/2303.07336) | [code](https://github.com/IDEA-Research/MP-Former)

\n\n[1]Interactive Segmentation as Gaussian Process Classification
\n[paper](https://arxiv.org/abs/2302.14578)

\n\n
\n\n
\n\n \n\n### Panoptic Segmentation\n\n[3]You Only Segment Once: Towards Real-Time Panoptic Segmentation
\n[paper](https://arxiv.org/abs/2303.14651) | [code](https://github.com/hujiecpp/yoso)

\n\n[2]UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration
\n[paper](https://arxiv.org/abs/2206.15083)

\n\n[1]Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
\n[paper](https://arxiv.org/abs/2303.04803)

\n\n
\n\n
\n\n \n\n### Semantic Segmentation\n\n[28]Federated Incremental Semantic Segmentation
\n[paper](https://arxiv.org/abs/2304.04620) | [code](https://github.com/jiahuadong/fiss)

\n\n[27]Continual Semantic Segmentation with Automatic Memory Sample Selection
\n[paper](https://arxiv.org/abs/2304.05015)

\n\n[26]DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation
\n[paper](https://arxiv.org/abs/2304.02222) | [code](https://github.com/fy-vision/diga)

\n\n[25]Exploiting the Complementarity of 2D and 3D Networks to Address Domain-Shift in 3D Semantic Segmentation
\n[paper](https://arxiv.org/abs/2304.02991) | [code](https://github.com/cvlab-unibo/mm2d3d)

\n\n[24]3D Semantic Segmentation in the Wild: Learning Generalized Models for Adverse-Condition Point Clouds
\n[paper](https://arxiv.org/abs/2304.00690) | [code](https://github.com/xiaoaoran/semanticstf)

\n\n[23]Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.14360)

\n\n[22]Instant Domain Augmentation for LiDAR Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.14378)

\n\n[21]Leveraging Hidden Positives for Unsupervised Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.15014) | [code](https://github.com/hynnsk/hp)

\n\n[20]LaserMix for Semi-Supervised LiDAR Semantic Segmentation
\n[paper](https://arxiv.org/abs/2207.00026) | [code](https://github.com/ldkong1205/LaserMix)

\n\n[19]Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation
\n[paper](https://arxiv.org/abs/2208.09910) | [code](https://github.com/LiheYoung/UniMatch)

\n\n[18]Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs
\n[paper](https://arxiv.org/abs/2212.00785) | [code](https://github.com/kakaobrain/tcl)

\n\n[17]Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.11203) | [code](https://github.com/l1997i/lim3d)

\n\n[16]Reliability in Semantic Segmentation: Are We on the Right Track?
\n[paper](https://arxiv.org/abs/2303.11298) | [code](https://github.com/naver/relis)

\n\n[15]Generative Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.11316) | [code](https://github.com/fudan-zvg/gss)

\n\n[14]Novel Class Discovery for 3D Point Cloud Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.11610) | [code](https://github.com/luigiriz/nops)

\n\n[13]MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving
\n[paper](https://arxiv.org/abs/2303.08600) | [code](https://github.com/jialeli1/lidarseg3d)

\n\n[12]Side Adapter Network for Open-Vocabulary Semantic Segmentation
\n[paper](https://arxiv.org/abs/2302.12242) | [code](https://github.com/mendelxu/san)

\n\n[11]Multi-view Inverse Rendering for Large-scale Real-world Indoor Scenes
\n[paper](https://arxiv.org/abs/2211.10206)

\n\n[10]Token Contrast for Weakly-Supervised Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.01267) | [code](https://github.com/rulixiang/toco)

\n\n[9]Delivering Arbitrary-Modal Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.01480) | [code](https://github.com/jamycheung/DELIVER)

\n\n[8]Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation
\n[paper](https://arxiv.org/abs/2211.12268)

\n\n[7]Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
\n[paper](https://arxiv.org/abs/2210.04150) | [code](https://github.com/facebookresearch/ov-seg)

\n\n[6]Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos
\n[paper](https://arxiv.org/abs/2303.07224) | [code](https://github.com/THU-LYJ-Lab/AR-Seg)

\n\n[5]SCPNet: Semantic Scene Completion on Point Cloud
\n[paper](https://arxiv.org/abs/2303.06884)

\n\n[4]On Calibrating Semantic Segmentation Models: Analyses and An Algorithm
\n[paper](https://arxiv.org/abs/2212.12053)

\n\n[3]Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
\n[paper](https://arxiv.org/abs/2301.09121)

\n\n[1]Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation
\n[paper](https://arxiv.org/abs/2302.14250)

\n\n
\n\n
\n\n \n\n### Instance Segmentation\n\n[11]Mask-Free Video Instance Segmentation
\n[paper](https://arxiv.org/abs/2303.15904) | [code](https://github.com/syscv/maskfreevis)

\n\n[10]Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations
\n[paper](https://arxiv.org/abs/2303.16891)

\n\n[9]DoNet: Deep De-overlapping Network for Cytology Instance Segmentation
\n[paper](https://arxiv.org/abs/2303.14373)

\n\n[8]The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation
\n[paper](https://arxiv.org/abs/2303.15062)

\n\n[7]A Generalized Framework for Video Instance Segmentation
\n[paper](https://arxiv.org/abs/2211.08834) | [code](https://github.com/miranheo/genvis)

\n\n[6]FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation
\n[paper](https://arxiv.org/abs/2303.08594)

\n\n[5]SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation
\n[paper](https://arxiv.org/abs/2303.08578) | [code](https://github.com/lslrh/sim)

\n\n[4]DynaMask: Dynamic Mask Selection for Instance Segmentation
\n[paper](https://arxiv.org/abs/2303.07868) | [code](https://github.com/lslrh/dynamask)

\n\n[3]Beyond mAP: Towards better evaluation of instance segmentation
\n[paper](https://arxiv.org/abs/2207.01614)

\n\n[2]ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution
\n[paper](https://arxiv.org/abs/2303.00246)

\n\n[1]PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
\n[paper](https://arxiv.org/abs/2302.07387)

\n\n
\n\n
\n\n \n\n### Superpixel\n\n
\n\n \n\n### Video Object Segmentation\n\n[5]Spatio-Temporal Pixel-Level Contrastive Learning-based Source-Free Domain Adaptation for Video Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.14361) | [code](https://github.com/shaoyuanlo/stpl)

\n\n[4]Two-shot Video Object Segmentation
\n[paper](https://arxiv.org/abs/2303.12078)

\n\n[3]Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation
\n[paper](https://arxiv.org/abs/2303.10100)

\n\n[2]MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation
\n[paper](https://arxiv.org/abs/2303.07815)

\n\n[1]InstMove: Instance Motion for Object-centric Video Segmentation
\n[paper](https://arxiv.org/abs/2303.08132) | [code](https://github.com/wjf5203/vnext)

\n\n
\n\n
\n\n \n\n### Matting\n\n
\n\n \n\n### Dense Prediction\n\n[5]Probabilistic Prompt Learning for Dense Prediction
\n[paper](https://arxiv.org/abs/2304.00779)

\n\n[4]Ensemble-based Blackbox Attacks on Dense Prediction
\n[paper](https://arxiv.org/abs/2303.14304)

\n\n[3]Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection
\n[paper](https://arxiv.org/abs/2303.14960) | [code](https://github.com/PaddlePaddle/PaddleDetection)

\n\n[2]One-to-Few Label Assignment for End-to-End Dense Detection
\n[paper](https://arxiv.org/abs/2303.11567) | [code](https://github.com/strongwolf/o2f)

\n\n[1]DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction
\n[paper](https://arxiv.org/abs/2303.01573)

\n\n
\n\n
\n\n \n\n## Video Processing\n\n[12]BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation
\n[paper](https://arxiv.org/abs/2304.02225) | [code](https://github.com/junheum/biformer)

\n\n[11]VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
\n[paper](https://arxiv.org/abs/2303.16727)

\n\n[10]Implicit View-Time Interpolation of Stereo Videos using Multi-Plane Disparities and Non-Uniform Coordinates
\n[paper](https://arxiv.org/abs/2303.17181)

\n\n[9]Affordance Grounding from Demonstration Video to Target Image
\n[paper](https://arxiv.org/abs/2303.14644) | [code](https://github.com/showlab/afformer)

\n\n[8]Frame Flexible Network
\n[paper](https://arxiv.org/abs/2303.14817) | [code](https://github.com/bespontaneous/ffn)

\n\n[7]Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time
\n[paper](https://arxiv.org/abs/2303.15043) | [code](https://github.com/shangwei5/vidue)

\n\n[6]A Unified Pyramid Recurrent Network for Video Frame Interpolation
\n[paper](https://arxiv.org/abs/2211.03456)

\n\n[5]Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior
\n[paper](https://arxiv.org/abs/2303.09757) | [code](https://github.com/jiaqixuac/map-net)

\n\n[4]Blind Video Deflickering by Neural Filtering with a Flawed Atlas
\n[paper](https://arxiv.org/abs/2303.08120) | [code](https://github.com/chenyanglei/all-in-one-deflicker)

\n\n[3]Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
\n[paper](https://arxiv.org/abs/2209.15280) | [code](https://github.com/tencentarc/tvts)

\n\n[2]UV Volumes for Real-time Rendering of Editable Free-view Human Performance
\n[paper](https://arxiv.org/abs/2203.14402) | [code](https://github.com/fanegg/UV-Volumes)

\n\n[1]Exploring Discontinuity for Video Frame Interpolation
\n[paper](https://arxiv.org/abs/2202.07291)

\n\n
\n\n
\n\n \n\n### Video Editing\n\n[4]VIVE3D: Viewpoint-Independent Video Editing using 3D-Aware GANs
\n[paper](https://arxiv.org/abs/2303.15893)

\n\n[3]Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
\n[paper](https://arxiv.org/abs/2212.02802)

\n\n[2]Text-Visual Prompting for Efficient 2D Temporal Video Grounding
\n[paper](https://arxiv.org/abs/2303.04995)

\n\n[1]Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
\n[paper](https://arxiv.org/abs/2303.00440) | [code](https://github.com/MCG-NJU/EMA-VFI)

\n\n
\n\n
\n\n \n\n### Video Generation/Video Synthesis\n\n[7]Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
\n[paper](https://arxiv.org/abs/2303.11251) | [code](https://github.com/Ugness/MeBT)

\n\n[6]Conditional Image-to-Video Generation with Latent Flow Diffusion Models
\n[paper](https://arxiv.org/abs/2303.13744)

\n\n[5]3D Cinemagraphy from a Single Image
\n[paper](https://arxiv.org/abs/2303.05724)

\n\n[4]VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
\n[paper](https://arxiv.org/abs/2303.08320) | [code](https://github.com/modelscope/modelscope)

\n\n[3]MOSO: Decomposing MOtion, Scene and Object for Video Prediction
\n[paper](https://arxiv.org/abs/2303.03684) | [code](https://github.com/anonymous202203/moso)

\n\n[2]SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
\n[paper](https://arxiv.org/abs/2211.12194) | [code](https://github.com/Winfredy/SadTalker)

\n\n[1]Video Probabilistic Diffusion Models in Projected Latent Space
\n[paper](https://arxiv.org/abs/2302.07685) | [project](https://sihyun.me/PVDM)

\n\n\n
\n\n
\n\n \n\n### Video Super-Resolution\n\n[2]Structured Sparsity Learning for Efficient Video Super-Resolution
\n[paper](https://arxiv.org/abs/2206.07687)

\n\n[1]Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting
\n[paper](https://arxiv.org/abs/2303.08331)

\n\n
\n\n
\n\n \n\n## Estimation\n\n[1]Learning the Distribution of Errors in Stereo Matching for Joint Disparity and Uncertainty Estimation
\n[paper](https://arxiv.org/abs/2304.00152)

\n\n
\n\n \n\n### Optical Flow/Motion Estimation\n\n[4]AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation
\n[paper](https://arxiv.org/abs/2303.16493)

\n\n[3]Semi-Weakly Supervised Object Kinematic Motion Prediction
\n[paper](https://arxiv.org/abs/2303.17774)

\n\n[2]DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling
\n[paper](https://arxiv.org/abs/2303.14078)

\n\n[1]Rethinking Optical Flow from Geometric Matching Consistent Perspective
\n[paper](https://arxiv.org/abs/2303.08384) | [code](https://github.com/dqiaole/matchflow)
\n\n
\n\n
\n\n
\n\n \n\n### Depth Estimation\n\n[8]EGA-Depth: Efficient Guided Attention for Self-Supervised Multi-Camera Depth Estimation
\n[paper](https://arxiv.org/abs/2304.03369)

\n\n[7]DualRefine: Self-Supervised Depth and Pose Estimation Through Iterative Epipolar Sampling and Refinement Toward Equilibrium
\n[paper](https://arxiv.org/abs/2304.03560) | [code](https://github.com/antabangun/dualrefine)

\n\n[6]Single Image Depth Prediction Made Better: A Multivariate Gaussian Take
\n[paper](https://arxiv.org/abs/2303.18164)

\n\n[5]SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates
\n[paper](https://arxiv.org/abs/2303.13582)

\n\n[4]PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes
\n[paper](https://arxiv.org/abs/2210.01612) | [code](https://github.com/svip-lab/planedepth)

\n\n[3]HRDFuse: Monocular 360°Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions
\n[paper](https://arxiv.org/abs/2303.11616)

\n\n[2]Fully Self-Supervised Depth Estimation from Defocus Clue
\n[paper](https://arxiv.org/abs/2303.10752) | [code](https://github.com/ehzoahis/dered)

\n\n[1]Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
\n[paper](https://arxiv.org/abs/2211.13202) | [code](https://github.com/noahzn/Lite-Mono)

\n\n
\n\n\n
\n\n \n\n### Human Parsing/Human Pose Estimation\n\n[18]A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image
\n[paper](https://arxiv.org/abs/2304.03635) | [code](https://github.com/changlongjianggit/a2j-transformer)

\n\n[17]Monocular 3D Human Pose Estimation for Sports Broadcasts using Partial Sports Field Registration
\n[paper](https://arxiv.org/abs/2304.04437) | [code](https://github.com/tobibaum/partialsportsfieldreg_3dhpe)

\n\n[16]DeFeeNet: Consecutive 3D Human Motion Prediction with Deviation Feedback
\n[paper](https://arxiv.org/abs/2304.04496)

\n\n[15]TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation
\n[paper](https://arxiv.org/abs/2303.16730)

\n\n[14]PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation
\n[paper](https://arxiv.org/abs/2303.17472) | [code](https://github.com/qitaozhao/poseformerv2)

\n\n[13]ScarceNet: Animal Pose Estimation with Scarce Annotations
\n[paper](https://arxiv.org/abs/2303.15023) | [code](https://github.com/chaneyddtt/scarcenet)

\n\n[12]Human Pose Estimation in Extremely Low-Light Conditions
\n[paper](https://arxiv.org/abs/2303.15410)

\n\n[11]Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation
\n[paper](https://arxiv.org/abs/2303.11180)

\n\n[10]3D Human Mesh Estimation from Virtual Markers
\n[paper](https://arxiv.org/abs/2303.11726)

\n\n[9]Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation
\n[paper](https://arxiv.org/abs/2303.12246)

\n\n[8]Rigidity-Aware Detection for 6D Object Pose Estimation
\n[paper](https://arxiv.org/abs/2303.12396)

\n\n[7]Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video
\n[paper](https://arxiv.org/abs/2303.08475)

\n\n[6]Markerless Camera-to-Robot Pose Estimation via Self-supervised Sim-to-Real Transfer
\n[paper](https://arxiv.org/abs/2302.14338)

\n\n[5]TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation
\n[paper](https://arxiv.org/abs/2212.12902)

\n\n[4]Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting
\n[paper](https://arxiv.org/abs/2303.05095)

\n\n[3]PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation
\n[paper](https://arxiv.org/abs/2303.07337)

\n\n[2]DistilPose: Tokenized Pose Regression with Heatmap Distillation
\n[paper](https://arxiv.org/abs/2303.02455)

\n\n[1]Relightable Neural Human Assets from Multi-view Gradient Illuminations
\n[paper](https://arxiv.org/abs/2212.07648)

\n\n
\n\n
\n\n \n\n### Gesture Estimation\n\n[6]CAMS: CAnonicalized Manipulation Spaces for Category-Level Functional Hand-Object Manipulation Synthesis
\n[paper](https://arxiv.org/abs/2303.15469)

\n\n[5]Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild
\n[paper](https://arxiv.org/abs/2303.13652)

\n\n[4]Natural Language-Assisted Sign Language Recognition
\n[paper](https://arxiv.org/abs/2303.12080) | [code](https://github.com/FangyunWei/SLRT)

\n\n[3]CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment
\n[paper](https://arxiv.org/abs/2303.05725)

\n\n[2]Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement
\n[paper](https://arxiv.org/abs/2303.01765)

\n\n[1]Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos
\n[paper](https://arxiv.org/abs/2209.09484) | [code](https://github.com/fylwen/htt)

\n\n
\n\n\n
\n\n \n\n\n## Image Processing\n\n[3]Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR
\n[paper](https://arxiv.org/abs/2303.13779)

\n\n[2]PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment
\n[paper](https://arxiv.org/abs/2303.11526)

\n\n[1]DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation
\n[paper](https://arxiv.org/abs/2303.06285) | [code](https://github.com/yueming6568/deltaedit)

\n\n
\n\n \n\n### Super Resolution\n\n[11]Better \"CMOS\" Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution
\n[paper](https://arxiv.org/abs/2304.03542)

\n\n[10]Implicit Diffusion Models for Continuous Super-Resolution
\n[paper](https://arxiv.org/abs/2303.16491) | [code](https://github.com/ree1s/idm)

\n\n[9]SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
\n[paper](https://arxiv.org/abs/2303.17605)

\n\n[8]Learning Generative Structure Prior for Blind Text Image Super-resolution
\n[paper](https://arxiv.org/abs/2303.14726) | [code](https://github.com/csxmli2016/marconet)

\n\n[7]Learning to Zoom and Unzoom
\n[paper](https://arxiv.org/abs/2303.15390)

\n\n[6]Activating More Pixels in Image Super-Resolution Transformer
\n[paper](https://arxiv.org/abs/2205.04437) | [code](https://github.com/chxy95/hat)

\n\n[5]Super-Resolution Neural Operator
\n[paper](https://arxiv.org/abs/2303.02584) | [code](https://github.com/2y7c3/super-resolution-neural-operator)

\n\n[4]Local Implicit Normalizing Flow for Arbitrary-Scale Image Super-Resolution
\n[paper](https://arxiv.org/abs/2303.05156)

\n\n[3]Perception-Oriented Single Image Super-Resolution using Optimal Objective Estimation
\n[paper](https://arxiv.org/abs/2211.13676) | [code](https://github.com/seungho-snu/SROOE)

\n\n[2]N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution
\n[paper](https://arxiv.org/abs/2211.11436) | [code](https://github.com/rami0205/ngramswin)

\n\n[1]Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild
\n[paper](https://arxiv.org/abs/2302.07864)

\n\n
\n\n
\n\n \n\n### Image Restoration/Image Enhancement/Image Reconstruction\n\n[20]CherryPicker: Semantic Skeletonization and Topological Reconstruction of Cherry Trees
\n[paper](https://arxiv.org/abs/2304.04708)

\n\n[19]Generative Diffusion Prior for Unified Image Restoration and Enhancement
\n[paper](https://arxiv.org/abs/2304.01247)

\n\n[18]CARTO: Category and Joint Agnostic Reconstruction of ARTiculated Objects
\n[paper](https://arxiv.org/abs/2303.15782)

\n\n[17]HyperThumbnail: Real-time 6K Image Rescaling with Rate-distortion Optimization
\n[paper](https://arxiv.org/abs/2304.01064) | [code](https://github.com/abnervictor/hyperthumbnail)

\n\n[16]Burstormer: Burst Image Restoration and Enhancement Transformer
\n[paper](https://arxiv.org/abs/2304.01194)

\n\n[15]Visual-Tactile Sensing for In-Hand Object Reconstruction
\n[paper](https://arxiv.org/abs/2303.14498)

\n\n[14]3D-Aware Multi-Class Image-to-Image Translation with NeRFs
\n[paper](https://arxiv.org/abs/2303.15012) | [code](https://github.com/sen-mao/3di2i-translation)

\n\n[13]CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not
\n[paper](https://arxiv.org/abs/2303.13440)

\n\n[12]Instant Volumetric Head Avatars
\n[paper](https://arxiv.org/abs/2211.12499)

\n\n[11]Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank
\n[paper](https://arxiv.org/abs/2303.09101) | [code](https://github.com/huang-shirui/semi-uir)

\n\n[10]ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction
\n[paper](https://arxiv.org/abs/2303.05938) | [code](https://github.com/zhengdiyu/arbitrary-hands-3d-reconstruction)

\n\n[9]Masked Image Modeling with Local Multi-Scale Reconstruction
\n[paper](https://arxiv.org/abs/2303.05251) | [code](https://github.com/huawei-noah/Efficient-Computing)

\n\n[8]Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective
\n[paper](https://arxiv.org/abs/2303.06859) | [code](https://github.com/lixinustc/Casual-IRDIL)

\n\n[7]DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration
\n[paper](https://arxiv.org/abs/2303.06885)

\n\n[6]Robust Unsupervised StyleGAN Image Restoration
\n[paper](https://arxiv.org/abs/2302.06733)

\n\n[5]Raw Image Reconstruction with Learned Compact Metadata
\n[paper](https://arxiv.org/abs/2302.12995)

\n\n[4]Efficient and Explicit Modelling of Image Hierarchies for Image Restoration
\n[paper](https://arxiv.org/abs/2303.00748) | [code](https://github.com/ofsoundof/GRL-Image-Restoration)

\n\n[3]Imagic: Text-Based Real Image Editing with Diffusion Models
\n[paper](https://arxiv.org/abs/2210.09276) | [project](https://imagic-editing.github.io/)

\n\n[2]High-resolution image reconstruction with latent diffusion models from human brain activity
\n[paper](https://www.biorxiv.org/content/10.1101/2022.11.18.517004v2) | [project](https://sites.google.com/view/stablediffusion-with-brain/)

\n\n[1]Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models
\n[paper](https://arxiv.org/abs/2211.10655)

\n\n
\n\n
\n\n\n \n\n### Image Shadow Removal/Image Reflection Removal\n\n[2]Nighttime Smartphone Reflective Flare Removal Using Optical Center Symmetry Prior
\n[paper](https://arxiv.org/abs/2303.15046) | [code](https://github.com/ykdai/BracketFlare)

\n\n[1]LightPainter: Interactive Portrait Relighting with Freehand Scribble
\n[paper](https://arxiv.org/abs/2303.12950)

\n\n
\n\n\n\n \n\n### Image Denoising/Deblurring/Deraining/Dehazing\n\n[12]RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors
\n[paper](https://arxiv.org/abs/2304.03994) | [code](https://github.com/RQ-Wu/RIDCP_dehazing)

\n\n[11]HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering
\n[paper](https://arxiv.org/abs/2304.01686)

\n\n[10]Real-time Controllable Denoising for Image and Video
\n[paper](https://arxiv.org/abs/2303.16425)

\n\n[9]LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising
\n[paper](https://arxiv.org/abs/2304.00534) | [code](https://github.com/wang-xiaodingdd/lgbpn)

\n\n[8]Curricular Contrastive Regularization for Physics-aware Single Image Dehazing
\n[paper](https://arxiv.org/abs/2303.14218) | [code](https://github.com/yuzheng9/c2pnet)

\n\n[7]Spatially Adaptive Self-Supervised Learning for Real-World Image Denoising
\n[paper](https://arxiv.org/abs/2303.14934) | [code](https://github.com/nagejacob/spatiallyadaptivessid)

\n\n[6]Masked Image Training for Generalizable Deep Image Denoising
\n[paper](https://arxiv.org/abs/2303.13132) | [code](https://github.com/haoyuc/maskeddenoising)

\n\n[5]Learning A Sparse Transformer Network for Effective Image Deraining
\n[paper](https://arxiv.org/abs/2303.11950) | [code](https://github.com/cschenxiang/drsformer)

\n\n[4]Uncertainty-Aware Unsupervised Image Deblurring with Deep Residual Prior
\n[paper](https://arxiv.org/abs/2210.05361)

\n\n[3]Polarized Color Image Denoising using Pocoformer
\n[paper](https://arxiv.org/abs/2207.00215)

\n\n[2]Blur Interpolation Transformer for Real-World Motion from Blur
\n[paper](https://arxiv.org/abs/2211.11423) | [code](https://github.com/zzh-tech/BiT)

\n\n[1]Structured Kernel Estimation for Photon-Limited Deconvolution
\n[paper](https://arxiv.org/abs/2303.03472) | [code](https://github.com/sanghviyashiitb/structured-kernel-cvpr23)

\n\n
\n\n
\n\n \n\n### Image Editing/Inpainting\n\n[6]SIEDOB: Semantic Image Editing by Disentangling Object and Background
\n[paper](https://arxiv.org/abs/2303.13062) | [code](https://github.com/wuyangluo/siedob)

\n\n[5]CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing
\n[paper](https://arxiv.org/abs/2303.05031)

\n\n[4]SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
\n[paper](https://arxiv.org/abs/2212.05034)

\n\n[3]Interactive Cartoonization with Controllable Perceptual Factors
\n[paper](https://arxiv.org/abs/2212.09555)

\n\n[2]Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint
\n[paper](https://arxiv.org/abs/2211.11448) | [code](https://github.com/kumapowerliu/clcae)

\n\n[1]LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data
\n[paper](https://arxiv.org/abs/2208.14889) | [code](https://github.com/KU-CVLAB/LANIT)

\n\n
\n\n
\n\n \n\n### Image Translation\n\n[1]Masked and Adaptive Transformer for Exemplar Based Image Translation
\n[paper](https://arxiv.org/abs/2303.17123) | [code](https://github.com/aiart-hdu/matebit)

\n\n
\n\n
\n\n \n\n### Image Quality Assessment\n\n[3]Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild
\n[paper](https://arxiv.org/abs/2304.00451)

\n\n[2]CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability
\n[paper](https://arxiv.org/abs/2112.06592)

\n\n[1]Quality-aware Pre-trained Models for Blind Image Quality Assessment
\n[paper](https://arxiv.org/abs/2303.00521)

\n\n
\n\n \n\n### Style Transfer\n\n[4]CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
\n[paper](https://arxiv.org/abs/2303.17867)

\n\n[3]Neural Preset for Color Style Transfer
\n[paper](https://arxiv.org/abs/2303.13511) | [code](https://github.com/ZHKKKe/NeuralPreset)

\n\n[2]StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields
\n[paper](https://arxiv.org/abs/2303.10598)

\n\n[1]Fix the Noise: Disentangling Source Feature for Transfer Learning of StyleGAN
\n[paper](https://arxiv.org/abs/2204.14079) | [code](https://github.com/LeeDongYeun/FixNoise)

\n\n
\n\n
\n\n \n\n### Image Registration\n\n[1]Indescribable Multi-modal Spatial Evaluator
\n[paper](https://arxiv.org/abs/2303.00369) | [code](https://github.com/Kid-Liet/IMSE)

\n\n
\n\n
\n\n \n\n## Face\n\n
\n\n
\n\n \n\n### Facial Recognition/Detection\n\n[6]Gradient Attention Balance Network: Mitigating Face Recognition Racial Bias via Gradient Attention
\n[paper](https://arxiv.org/abs/2304.02284)

\n\n[5]Micron-BERT: BERT-based Facial Micro-Expression Recognition
\n[paper](https://arxiv.org/abs/2304.03195) | [code](https://github.com/uark-cviu/micron-bert)

\n\n[4]Towards Effective Adversarial Textured 3D Meshes on Physical Face Recognition
\n[paper](https://arxiv.org/abs/2303.15818)

\n\n[3]Sibling-Attack: Rethinking Transferable Adversarial Attacks against Face Recognition
\n[paper](https://arxiv.org/abs/2303.12512)

\n\n[2]Local Region Perception and Relationship Learning Combined with Feature Fusion for Facial Action Unit Detection
\n[paper](https://arxiv.org/abs/2303.08545)

\n\n[1]Multi Modal Facial Expression Recognition with Transformer-Based Fusion Networks and Dynamic Sampling
\n[paper](https://arxiv.org/abs/2303.08419)

\n\n
\n\n
\n\n \n\n### Face Generation/Face Synthesis/Face Reconstruction/Face Editing\n\n[13]GANHead: Towards Generative Animatable Neural Head Avatars
\n[paper](https://arxiv.org/abs/2304.03950)

\n\n[12]Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos
\n[paper](https://arxiv.org/abs/2304.01436)

\n\n[11]StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer
\n[paper](https://arxiv.org/abs/2304.02744)

\n\n[10]OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering
\n[paper](https://arxiv.org/abs/2303.14662)

\n\n[9]High-fidelity 3D Human Digitization from Single 2K Resolution Images
\n[paper](https://arxiv.org/abs/2303.15108)

\n\n[8]FaceLit: Neural 3D Relightable Faces
\n[paper](https://arxiv.org/abs/2303.15437)

\n\n[7]SunStage: Portrait Reconstruction and Relighting using the Sun as a Light Stage
\n[paper](https://arxiv.org/abs/2204.03648)

\n\n[6]MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
\n[paper](https://arxiv.org/abs/2212.08062) | [code](https://github.com/Meta-Portrait/MetaPortrait)

\n\n[5]NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images
\n[paper](https://arxiv.org/abs/2303.14092)

\n\n[4]Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images
\n[paper](https://arxiv.org/abs/2303.10896)

\n\n[3]Robust Model-based Face Reconstruction through Weakly-Supervised Outlier Segmentation
\n[paper](https://arxiv.org/abs/2106.09614) | [code](https://github.com/unibas-gravis/Occlusion-Robust-MoFA)

\n\n[2]A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images
\n[paper](https://arxiv.org/abs/2302.14434)

\n\n[1]MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
\n[paper](https://arxiv.org/abs/2212.08062) | [code](https://github.com/Meta-Portrait/MetaPortrait)

\n\n
\n\n
\n\n \n\n### Face Forgery/Face Anti-Spoofing\n\n[4]Hierarchical Fine-Grained Image Forgery Detection and Localization
\n[paper](https://arxiv.org/abs/2303.17111)

\n\n[3]Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment
\n[paper](https://arxiv.org/abs/2303.13662)

\n\n[2]Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization
\n[paper](https://arxiv.org/abs/2210.14457) | [code](https://github.com/megvii-research/caddm)

\n\n[1]Physical-World Optical Adversarial Attacks on 3D Face Recognition
\n[paper](https://arxiv.org/abs/2205.13412)

\n\n
\n\n\n
\n\n \n\n## Object Tracking\n\n[11]Unsupervised Sampling Promoting for Stochastic Human Trajectory Prediction
\n[paper](https://arxiv.org/abs/2304.04298) | [code](https://github.com/viewsetting/unsupervised_sampling_promoting)

\n\n[10]Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion
\n[paper](https://arxiv.org/abs/2304.01893)

\n\n[9]Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction
\n[paper](https://arxiv.org/abs/2303.16005) | [code](https://github.com/colorfulfuture/gc-vrnn)

\n\n[8]Visibility Aware Human-Object Interaction Tracking from Single RGB Camera
\n[paper](https://arxiv.org/abs/2303.16479)

\n\n[7]DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks
\n[paper](https://arxiv.org/abs/2304.00571) | [code](https://github.com/jimmy-dq/dropmae)

\n\n[6]MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking
\n[paper](https://arxiv.org/abs/2303.10404)

\n\n[5]Visual Prompt Multi-Modal Tracking
\n[paper](https://arxiv.org/abs/2303.10826) | [code](https://github.com/jiawen-zhu/vipt)

\n\n[4]Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking
\n[paper](https://arxiv.org/abs/2203.14360v2) | [code](https://github.com/noahcao/OC_SORT)

\n\n[3]Focus On Details: Online Multi-object Tracking with Diverse Fine-grained Representation
\n[paper](https://arxiv.org/abs/2302.14589)

\n\n[2]Referring Multi-Object Tracking
\n[paper](https://arxiv.org/abs/2303.03366)

\n\n[1]Simple Cues Lead to a Strong Multi-Object Tracker
\n[paper](https://arxiv.org/abs/2206.04656)

\n\n
\n\n
\n \n\n## Image & Video Retrieval/Video Understanding\n\n[17]Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
\n[paper](https://arxiv.org/abs/2304.05173)

\n\n[16]System-status-aware Adaptive Network for Online Streaming Video Understanding
\n[paper](https://arxiv.org/abs/2303.15742)

\n\n[15]Hierarchical Video-Moment Retrieval and Step-Captioning
\n[paper](https://arxiv.org/abs/2303.16406) | [code](https://github.com/j-min/HiREST)

\n\n[14]Procedure-Aware Pretraining for Instructional Video Understanding
\n[paper](https://arxiv.org/abs/2303.18230) | [code](https://github.com/salesforce/paprika)

\n\n[13]Use Your Head: Improving Long-Tail Video Recognition
\n[paper](https://arxiv.org/abs/2304.01143)

\n\n[12]Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style
\n[paper](https://arxiv.org/abs/2303.14348) | [code](https://github.com/buptlinfy/zse-sbir)

\n\n[11]Selective Structured State-Spaces for Long-Form Video Understanding
\n[paper](https://arxiv.org/abs/2303.14526)

\n\n[10]Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
\n[paper](https://arxiv.org/abs/2301.00184) | [code](https://github.com/whwu95/Cap4Video)

\n\n[9]NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory
\n[paper](https://arxiv.org/abs/2301.00746)

\n\n[8]Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
\n[paper](https://arxiv.org/abs/2303.13800)

\n\n[7]Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
\n[paper](https://arxiv.org/abs/2303.13874)

\n\n[6]Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval
\n[paper](https://arxiv.org/abs/2303.12501) | [code](https://github.com/anosorae/irra)

\n\n[5]Dual-path Adaptation from Image to Video Transformers
\n[paper](https://arxiv.org/abs/2303.09857) | [code](https://github.com/park-jungin/dualpath)

\n\n[4]Data-Free Sketch-Based Image Retrieval
\n[paper](https://arxiv.org/abs/2303.07775)

\n\n[3]DAA: A Delta Age AdaIN operation for age estimation via binary code transformer
\n[paper](https://arxiv.org/abs/2303.07929)

\n\n[2]VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
\n[paper](https://arxiv.org/abs/2211.12764) | [code](https://github.com/bighuang624/vop)

\n\n[1]Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
\n[paper](https://arxiv.org/abs/2206.02082)

\n\n
\n\n\n\n \n\n### Action/Activity Recognition, Detection, Segmentation, and Localization\n\n[16]Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition
\n[paper](https://arxiv.org/abs/2303.15467)

\n\n[15]STMixer: A One-Stage Sparse Action Detector
\n[paper](https://arxiv.org/abs/2303.15879)

\n\n[14]TimeBalance: Temporally-Invariant and Temporally-Distinctive Video Representations for Semi-Supervised Action Recognition
\n[paper](https://arxiv.org/abs/2303.16268)

\n\n[13]Decomposed Cross-modal Distillation for RGB-based Temporal Action Detection
\n[paper](https://arxiv.org/abs/2303.17285)

\n\n[12]STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition
\n[paper](https://arxiv.org/abs/2303.18177)

\n\n[11]MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition
\n[paper](https://arxiv.org/abs/2304.00946)

\n\n[10]On the Benefits of 3D Pose and Tracking for Human Action Recognition
\n[paper](https://arxiv.org/abs/2304.01199)

\n\n[9]3Mformer: Multi-order Multi-mode Transformer for Skeletal Action Recognition
\n[paper](https://arxiv.org/abs/2303.14474)

\n\n[8]Box-Level Active Detection
\n[paper](https://arxiv.org/abs/2303.13089)

\n\n[7]Actionlet-Dependent Contrastive Learning for Unsupervised Skeleton-Based Action Recognition
\n[paper](https://arxiv.org/abs/2303.10904)

\n\n[6]Open Set Action Recognition via Multi-Label Evidential Learning
\n[paper](https://arxiv.org/abs/2303.12698)

\n\n[5]Video Test-Time Adaptation for Action Recognition
\n[paper](https://arxiv.org/abs/2211.15393)

\n\n[4]Post-Processing Temporal Action Detection
\n[paper](https://arxiv.org/abs/2211.14924)

\n\n[3]TriDet: Temporal Action Detection with Relative Boundary Modeling
\n[paper](https://arxiv.org/abs/2303.07347) | [code](https://github.com/sssste/TriDet)

\n\n[2]Learning Discriminative Representations for Skeleton Based Action Recognition
\n[paper](https://arxiv.org/abs/2303.03729)

\n\n[1]Continuous Sign Language Recognition with Correlation Network
\n[paper](https://arxiv.org/abs/2303.03202) | [code](https://github.com/hulianyuyy/CorrNet)

\n\n
\n\n \n\n### Person Re-Identification/Detection\n\n[7]Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification
\n[paper](https://arxiv.org/abs/2304.04205) | [code](https://github.com/jiawei151/sgiel_vireid)

\n\n[6]PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-identification
\n[paper](https://arxiv.org/abs/2304.01537)

\n\n[5]Large-scale Training Data Search for Object Re-identification
\n[paper](https://arxiv.org/abs/2303.16186)

\n\n[4]Adaptive Sparse Pairwise Loss for Object Re-Identification
\n[paper](https://arxiv.org/abs/2303.18247) | [code](https://github.com/astaxanthin/adasp)

\n\n[3]Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-identification
\n[paper](https://arxiv.org/abs/2303.14481) | [code](https://github.com/zyk100/llcm)

\n\n[2]TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification
\n[paper](https://arxiv.org/abs/2303.06819) | [code](https://github.com/kali-hac/transg)

\n\n[1]MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID
\n[paper](https://arxiv.org/abs/2303.07065) | [code](https://github.com/vimar-gu/MSINet)

\n\n
\n\n \n\n### Image/Video Captioning\n\n[8]Model-Agnostic Gender Debiased Image Captioning
\n[paper](https://arxiv.org/abs/2304.03693)

\n\n[7]Cross-Domain Image Captioning with Discriminative Finetuning
\n[paper](https://arxiv.org/abs/2304.01662) | [code](https://github.com/facebookresearch/EGG)

\n\n[6]AutoAD: Movie Description in Context
\n[paper](https://arxiv.org/abs/2303.16899) | [code](https://github.com/Soldelli/MAD)

\n\n[5]Text with Knowledge Graph Augmented Transformer for Video Captioning
\n[paper](https://arxiv.org/abs/2303.12423)

\n\n[4]Dual-Stream Transformer for Generic Event Boundary Captioning
\n[paper](https://arxiv.org/abs/2207.03038) | [code](https://github.com/gx77/dual-stream-transformer-for-generic-event-boundary-captioning)

\n\n[3]ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
\n[paper](https://arxiv.org/abs/2303.02437) | [code](https://github.com/joeyz0z/conzic)

\n\n[2]Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
\n[paper](https://arxiv.org/abs/2303.00040)

\n\n[1]Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
\n[paper](https://arxiv.org/abs/2302.14115) | [project](https://antoyang.github.io/vid2seq.html)

\n\n
\n\n\n \n\n## Medical Imaging\n\n[15]Deep Prototypical-Parts Ease Morphological Kidney Stone Identification and are Competitively Robust to Photometric Perturbations
\n[paper](https://arxiv.org/abs/2304.04077) | [code](https://github.com/danielf29/prototipical_parts)

\n\n[14]Coherent Concept-based Explanations in Medical Image and Its Application to Skin Lesion Diagnosis
\n[paper](https://arxiv.org/abs/2304.04579) | [code](https://github.com/cristianopatricio/coherent-cbe-skin)

\n\n[13]Topology-Guided Multi-Class Cell Context Generation for Digital Pathology
\n[paper](https://arxiv.org/abs/2304.02255)

\n\n[12]Fair Federated Medical Image Segmentation via Client Contribution Estimation
\n[paper](https://arxiv.org/abs/2303.16520)

\n\n[11]Directional Connectivity-based Segmentation of Medical Images
\n[paper](https://arxiv.org/abs/2304.00145) | [code](https://github.com/zyun-y/dconnnet)

\n\n[10]Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization
\n[paper](https://arxiv.org/abs/2304.00212)

\n\n[9]Label-Free Liver Tumor Segmentation
\n[paper](https://arxiv.org/abs/2303.14869) | [code](https://github.com/mrgiovanni/synthetictumors)

\n\n[8]Image Quality-aware Diagnosis via Meta-knowledge Co-embedding
\n[paper](https://arxiv.org/abs/2303.15038)

\n\n[7]RepMode: Learning to Re-parameterize Diverse Experts for Subcellular Structure Prediction
\n[paper](https://arxiv.org/abs/2212.10066) | [code](https://github.com/Correr-Zhou/RepMode)

\n\n[6]Orthogonal Annotation Benefits Barely-supervised Medical Image Segmentation
\n[paper](https://arxiv.org/abs/2303.13090) | [code](https://github.com/hengcai-nju/desco)

\n\n[5]Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification
\n[paper](https://arxiv.org/abs/2303.08446)

\n\n[4]Neuron Structure Modeling for Generalizable Remote Physiological Measurement
\n[paper](https://arxiv.org/abs/2303.05955) | [code](https://github.com/lupaopao/nest)

\n\n[3]Unsupervised Contour Tracking of Live Cells by Mechanical and Cycle Consistency Losses
\n[paper](https://arxiv.org/abs/2303.08364) | [code](https://github.com/junbongjang/contour-tracking)

\n\n[2]Deep Feature In-painting for Unsupervised Anomaly Detection in X-ray Images
\n[paper](https://arxiv.org/pdf/2111.13495.pdf) | [code](https://github.com/tiangexiang/SQUID)

\n\n[1]Label-Free Liver Tumor Segmentation
\n[paper](https://arxiv.org/pdf/2210.14845.pdf) | [code](https://github.com/MrGiovanni/SyntheticTumors)

\n\n
\n\n\n \n\n\n## Text Detection/Recognition/Understanding\n\n[7]Towards Unified Scene Text Spotting based on Sequence Generation
\n[paper](https://arxiv.org/abs/2304.03435)

\n\n[6]Images Speak in Images: A Generalist Painter for In-Context Visual Learning
\n[paper](https://arxiv.org/abs/2212.02499) | [code](https://github.com/baaivision/painter)

\n\n[5]Context De-confounded Emotion Recognition
\n[paper](https://arxiv.org/abs/2303.11921)

\n\n[4]Joint Visual Grounding and Tracking with Natural Language Specification
\n[paper](https://arxiv.org/abs/2303.12027)

\n\n[3]Unifying Vision, Text, and Layout for Universal Document Processing
\n[paper](https://arxiv.org/abs/2212.02623)

\n\n[2]Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling
\n[paper](https://arxiv.org/abs/2303.06949)

\n\n[1]DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
\n[paper](https://arxiv.org/pdf/2211.10772v3.pdf) | [code](https://github.com/ViTAE-Transformer/DeepSolo)

\n\n
\n\n\n\n \n\n## Remote Sensing Image\n\n
\n\n\n \n\n## GAN/Generative/Adversarial\n\n[7]Fine-Grained Face Swapping via Regional GAN Inversion
\n[paper](https://arxiv.org/abs/2211.14068)

\n\n[6]Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models
\n[paper](https://arxiv.org/abs/2303.10774)

\n\n[5]Graph Transformer GANs for Graph-Constrained House Generation
\n[paper](https://arxiv.org/abs/2303.08225)

\n\n[4]Improving GAN Training via Feature Space Shrinkage
\n[paper](https://arxiv.org/abs/2303.01559) | [code](https://github.com/wentianzhang-ml/adaptivemix)

\n\n[3]Adversarial Attack with Raindrops
\n[paper](https://arxiv.org/abs/2302.14267)

\n\n[2]T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations
\n[paper](https://arxiv.org/abs/2301.06052) | [project](https://mael-zys.github.io/T2M-GPT/)

\n\n[1]Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
\n[paper](https://arxiv.org/abs/2211.11208) | [project](https://mrtornado24.github.io/Next3D/)

\n\n\n
\n\n \n\n## Image Generation/Image Synthesis\n\n[30]Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation
\n[paper](https://arxiv.org/abs/2304.01816)

\n\n[29]Few-shot Semantic Image Synthesis with Class Affinity Transfer
\n[paper](https://arxiv.org/abs/2304.02321)

\n\n[28]Variational Distribution Learning for Unsupervised Text-to-Image Generation
\n[paper](https://arxiv.org/abs/2303.16105)

\n\n[27]HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images
\n[paper](https://arxiv.org/abs/2303.16509)

\n\n[26]LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation
\n[paper](https://arxiv.org/abs/2303.17189) | [code](https://github.com/zgctroy/layoutdiffusion)

\n\n[25]Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert
\n[paper](https://arxiv.org/abs/2303.17480) | [code](https://github.com/sxjdwang/talklip)

\n\n[24]Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation
\n[paper](https://arxiv.org/abs/2303.14377)

\n\n[23]Freestyle Layout-to-Image Synthesis
\n[paper](https://arxiv.org/abs/2303.14412) | [code](https://github.com/essunny310/freestylenet)

\n\n[22]All are Worth Words: A ViT Backbone for Diffusion Models
\n[paper](https://arxiv.org/abs/2209.12152) | [code](https://github.com/baofff/U-ViT)

\n\n[21]Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
\n[paper](https://arxiv.org/abs/2211.11208) | [code](https://github.com/MrTornado24/Next3D)

\n\n[20]Shifted Diffusion for Text-to-image Generation
\n[paper](https://arxiv.org/abs/2211.15388) | [code](https://github.com/drboog/Shifted_Diffusion)

\n\n[19]Towards Practical Plug-and-Play Diffusion Models
\n[paper](https://arxiv.org/abs/2212.05973)

\n\n[18]Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis
\n[paper](https://arxiv.org/abs/2303.14157)

\n\n[17]Wavelet Diffusion Models are fast and scalable Image Generators
\n[paper](https://arxiv.org/abs/2211.16152) | [code](https://github.com/vinairesearch/wavediff)

\n\n[16]Learning 3D-aware Image Synthesis with Unknown Pose Distribution
\n[paper](https://arxiv.org/abs/2301.07702)

\n\n[15]Picture that Sketch: Photorealistic Image Generation from Abstract Sketches
\n[paper](https://arxiv.org/abs/2303.11162)

\n\n[14]3DQD: Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process
\n[paper](https://arxiv.org/abs/2303.10406) | [code](https://github.com/colorful-liyu/3dqd)

\n\n[13]A Dynamic Multi-Scale Voxel Flow Network for Video Prediction
\n[paper](https://arxiv.org/abs/2303.09875) | [code](https://github.com/megvii-research/CVPR2023-DMVFN)

\n\n[12]Regularized Vector Quantization for Tokenized Image Synthesis
\n[paper](https://arxiv.org/abs/2303.06424)

\n\n[11]SpaText: Spatio-Textual Representation for Controllable Image Generation
\n[paper](https://arxiv.org/abs/2211.14305)

\n\n[10]Unifying Layout Generation with a Decoupled Diffusion Model
\n[paper](https://arxiv.org/abs/2303.05049)

\n\n[9]Scaling up GANs for Text-to-Image Synthesis
\n[paper](https://arxiv.org/abs/2303.05511)

\n\n[8]Inversion-Based Style Transfer with Diffusion Models
\n[paper](https://arxiv.org/abs/2211.13203) | [code](https://github.com/zyxelsa/InST)

\n\n[7]Perspective Fields for Single Image Camera Calibration
\n[paper](https://arxiv.org/abs/2212.03239)

\n\n[6]VGFlow: Visibility guided Flow Network for Human Reposing
\n[paper](https://arxiv.org/abs/2211.08540)

\n\n[5]DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
\n[paper](https://arxiv.org/abs/2208.12242) | [code](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/ppdiffusers/examples/dreambooth)

\n\n[4]Progressive Open Space Expansion for Open-Set Model Attribution
\n[paper](https://arxiv.org/abs/2303.06877) | [code](https://github.com/tianyunyoung/pose)

\n\n[3]Person Image Synthesis via Denoising Diffusion Model
\n[paper](https://arxiv.org/abs/2211.12500)

\n\n[2]Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models
\n[paper](https://arxiv.org/abs/2211.10655)

\n\n[1]Parallel Diffusion Models of Operator and Image for Blind Inverse Problems
\n[paper](https://arxiv.org/abs/2211.10656)

\n\n
\n\n \n\n## 3D Vision\n\n[3]LinK: Linear Kernel for LiDAR-based 3D Perception
\n[paper](https://arxiv.org/abs/2303.16094)

\n\n[2]Learning a 3D Morphable Face Reflectance Model from Low-cost Data
\n[paper](https://arxiv.org/abs/2303.11686) | [code](https://github.com/yxuhan/reflectancemm)

\n\n[1]Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction
\n[paper](https://arxiv.org/abs/2302.07817) | [code](https://github.com/wzzheng/tpvformer)

\n\n
\n\n \n\n### Point Cloud\n\n[26]MEnsA: Mix-up Ensemble Average for Unsupervised Multi Target Domain Adaptation on 3D Point Clouds
\n[paper](https://arxiv.org/abs/2304.01554) | [code](https://github.com/sinashish/mensa_mtda)

\n\n[25]Binarizing Sparse Convolutional Networks for Efficient Point Cloud Analysis
\n[paper](https://arxiv.org/abs/2303.15493)

\n\n[24]Self-positioning Point-based Transformer for Point Cloud Understanding
\n[paper](https://arxiv.org/abs/2303.16450) | [code](https://github.com/mlvlab/spotr)

\n\n[23]NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point Cloud
\n[paper](https://arxiv.org/abs/2303.16465)

\n\n[22]PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations
\n[paper](https://arxiv.org/abs/2303.16958)

\n\n[21]Rethinking the Approximation Error in 3D Surface Fitting for Point Cloud Normal Estimation
\n[paper](https://arxiv.org/abs/2303.17167) | [code](https://github.com/hikvision-research/3dvision)

\n\n[20]Learning Human-to-Robot Handovers from Point Clouds
\n[paper](https://arxiv.org/abs/2303.17592)

\n\n[19]Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting
\n[paper](https://arxiv.org/abs/2304.00467) | [code](https://github.com/whu-usi3dv/sghr)

\n\n[18]Unsupervised Inference of Signed Distance Functions from Single Sparse Point Clouds without Learning Priors
\n[paper](https://arxiv.org/abs/2303.14505)

\n\n[17]NeuralPCI: Spatio-temporal Neural Field for 3D Point Cloud Multi-frame Non-linear Interpolation
\n[paper](https://arxiv.org/abs/2303.15126) | [code](https://github.com/ispc-lab/neuralpci)

\n\n[16]Recognizing Rigid Patterns of Unlabeled Point Clouds by Complete and Continuous Isometry Invariants with no False Negatives and no False Positives
\n[paper](https://arxiv.org/abs/2303.15385)

\n\n[15]CLIP2: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data
\n[paper](https://arxiv.org/abs/2303.12417)

\n\n[14]Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration
\n[paper](https://arxiv.org/abs/2303.13290) | [code](https://github.com/gfmei/udpreg)

\n\n[13]Deep Graph-based Spatial Consistency for Robust Non-rigid Point Cloud Registration
\n[paper](https://arxiv.org/abs/2303.09950) | [code](https://github.com/qinzheng93/graphscnet)

\n\n[12]Controllable Mesh Generation Through Sparse Latent Point Diffusion Models
\n[paper](https://arxiv.org/abs/2303.07938)

\n\n[11]Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis
\n[paper](https://arxiv.org/abs/2303.08134) | [code](https://github.com/zrrskywalker/point-nn)

\n\n[10]Rotation-Invariant Transformer for Point Cloud Matching
\n[paper](https://arxiv.org/abs/2303.08231)

\n\n[9]GraVoS: Voxel Selection for 3D Point-Cloud Detection
\n[paper](https://arxiv.org/abs/2208.08780)

\n\n[8]DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
\n[paper](https://arxiv.org/abs/2301.06051) | [code](https://github.com/haiyang-w/dsvt)

\n\n[7]PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees
\n[paper](https://arxiv.org/abs/2303.01959)

\n\n[6]ACL-SPC: Adaptive Closed-Loop system for Self-Supervised Point Cloud Completion
\n[paper](https://arxiv.org/abs/2303.01979) | [code](https://github.com/sangminhong/acl-spc_pytorch)

\n\n[5]DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization
\n[paper](https://arxiv.org/abs/2212.06331)

\n\n[4]Frequency-Modulated Point Cloud Rendering with Easy Editing
\n[paper](https://arxiv.org/abs/2303.07596v1)

\n\n[3]Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
\n[paper](https://arxiv.org/abs/2301.05709)

\n\n[2]ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer
\n[paper](https://arxiv.org/abs/2302.14435) | [code](https://github.com/I2-Multimedia-Lab/ProxyFormer)

\n\n[1]Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting
\n[paper](https://arxiv.org/abs/2302.13130) | [code](https://github.com/tarashakhurana/4d-occ-forecasting)

\n\n
\n\n\n \n\n### 3D Reconstruction\n\n[28]Multi-View Azimuth Stereo via Tangent Space Consistency
\n[paper](https://arxiv.org/abs/2303.16447) | [code](https://github.com/xucao-42/mvas)

\n\n[27]3D Line Mapping Revisited
\n[paper](https://arxiv.org/abs/2303.17504) | [code](https://github.com/cvg/limap)

\n\n[26]PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters
\n[paper](https://arxiv.org/abs/2303.14587) | [code](https://github.com/shuhongchen/panic3d-anime-reconstruction)

\n\n[25]HexPlane: A Fast Representation for Dynamic Scenes
\n[paper](https://arxiv.org/abs/2301.09632)

\n\n[24]Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container
\n[paper](https://arxiv.org/abs/2303.13805)

\n\n[23]BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
\n[paper](https://arxiv.org/abs/2303.14158)

\n\n[22]Structured 3D Features for Reconstructing Controllable Avatars
\n[paper](https://arxiv.org/abs/2212.06820)

\n\n[21]PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°
\n[paper](https://arxiv.org/abs/2303.13071)

\n\n[20]Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization
\n[paper](https://arxiv.org/abs/2303.13232)

\n\n[19]TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision
\n[paper](https://arxiv.org/abs/2303.13273) | [code](https://github.com/plusmultiply/taps3d)

\n\n[18]MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
\n[paper](https://arxiv.org/abs/2303.13510) | [code](https://github.com/smartbot-pjlab/mv-jar)

\n\n[17]PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision
\n[paper](https://arxiv.org/abs/2303.09554)

\n\n[16]SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation
\n[paper](https://arxiv.org/abs/2212.04493) | [code](https://github.com/yccyenchicheng/SDFusion)

\n\n[15]Masked Wavelet Representation for Compact Neural Radiance Fields
\n[paper](https://arxiv.org/abs/2212.09069)

\n\n[14]Decoupling Human and Camera Motion from Videos in the Wild
\n[paper](https://arxiv.org/abs/2302.12827)

\n\n[13]Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
\n[paper](https://arxiv.org/abs/2303.05937)

\n\n[12]NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images
\n[paper](https://arxiv.org/abs/2303.07653)

\n\n[11]Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion
\n[paper](https://arxiv.org/abs/2211.11674) | [code](https://github.com/google-research/nerf-from-image)

\n\n[10]MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices
\n[paper](https://arxiv.org/abs/2303.01932) | [code](https://github.com/ActiveVisionLab/MobileBrick)

\n\n[9]Unsupervised 3D Shape Reconstruction by Part Retrieval and Assembly
\n[paper](https://arxiv.org/abs/2303.01999)

\n\n[8]NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
\n[paper](https://arxiv.org/abs/2303.02375)

\n\n[7]HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling
\n[paper](https://arxiv.org/abs/2303.02700)

\n\n[6]MACARONS: Mapping And Coverage Anticipation with RGB Online Self-Supervision
\n[paper](https://arxiv.org/abs/2303.03315)

\n\n[4]Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness
\n[paper](https://arxiv.org/abs/2303.00971) | [code](https://github.com/zhijieshen-bjtu/DOPNet)

\n\n[3]Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes
\n[paper](https://arxiv.org/abs/2302.14348) | [code](https://github.com/jyunlee/Im2Hands)

\n\n[2]ECON: Explicit Clothed humans Obtained from Normals
\n[paper](https://arxiv.org/abs/2212.07422) | [code](https://github.com/YuliangXiu/ECON)

\n\n[1]Structured 3D Features for Reconstructing Relightable and Animatable Avatars
\n[paper](https://arxiv.org/abs/2212.06820) | [project](https://enriccorona.github.io/s3f/)

\n\n
\n\n \n\n### Scene Reconstruction / View Synthesis / Novel View Synthesis\n\n[51]Lift3D: Synthesize 3D Training Data by Lifting 2D GAN to 3D Generative Radiance Field
\n[paper](https://arxiv.org/abs/2304.03526)

\n\n[50]POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo
\n[paper](https://arxiv.org/abs/2304.04038) | [code](https://github.com/lixiny/poem)

\n\n[49]Neural Residual Radiance Fields for Streamably Free-Viewpoint Videos
\n[paper](https://arxiv.org/abs/2304.04452)

\n\n[48]Neural Lens Modeling
\n[paper](https://arxiv.org/abs/2304.04848)

\n\n[47]One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field
\n[paper](https://arxiv.org/abs/2304.05097)

\n\n[46]MonoHuman: Animatable Human Neural Field from Monocular Video
\n[paper](https://arxiv.org/abs/2304.02001)

\n\n[45]GINA-3D: Learning to Generate Implicit Neural Assets in the Wild
\n[paper](https://arxiv.org/abs/2304.02163)

\n\n[44]Neural Fields meet Explicit Geometric Representation for Inverse Rendering of Urban Scenes
\n[paper](https://arxiv.org/abs/2304.03266)

\n\n[43]F2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories
\n[paper](https://arxiv.org/abs/2303.15951)

\n\n[42]NeFII: Inverse Rendering for Reflectance Decomposition with Near-Field Indirect Illumination
\n[paper](https://arxiv.org/abs/2303.16617)

\n\n[41]Enhanced Stable View Synthesis
\n[paper](https://arxiv.org/abs/2303.17094)

\n\n[40]Consistent View Synthesis with Pose-Guided Diffusion Models
\n[paper](https://arxiv.org/abs/2303.17598)

\n\n[39]NeRF-Supervised Deep Stereo
\n[paper](https://arxiv.org/abs/2303.17603) | [code](https://github.com/fabiotosi92/nerf-supervised-deep-stereo)

\n\n[38]Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations
\n[paper](https://arxiv.org/abs/2303.18139)

\n\n[37]DyLiN: Making Light Field Networks Dynamic
\n[paper](https://arxiv.org/abs/2303.14243)

\n\n[36]FlexNeRF: Photorealistic Free-viewpoint Rendering of Moving Humans from Sparse Views
\n[paper](https://arxiv.org/abs/2303.14368)

\n\n[35]NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects
\n[paper](https://arxiv.org/abs/2303.14435) | [code](https://github.com/jokeryan/nerf-ds)

\n\n[34]SUDS: Scalable Urban Dynamic Scenes
\n[paper](https://arxiv.org/abs/2303.14536)

\n\n[33]JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields
\n[paper](https://arxiv.org/abs/2303.15427)

\n\n[32]Magic3D: High-Resolution Text-to-3D Content Creation
\n[paper](https://arxiv.org/abs/2211.10440)

\n\n[31]DiffRF: Rendering-Guided 3D Radiance Field Diffusion
\n[paper](https://arxiv.org/abs/2212.01206)

\n\n[30]Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields for Controllable Scene Stylization
\n[paper](https://arxiv.org/abs/2212.02766) | [code](https://github.com/dvlab-research/ref-npr)

\n\n[29]Interactive Segmentation of Radiance Fields
\n[paper](https://arxiv.org/abs/2212.13545)

\n\n[28]MAIR: Multi-view Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation
\n[paper](https://arxiv.org/abs/2303.12368)

\n\n[27]GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images
\n[paper](https://arxiv.org/abs/2303.13777)

\n\n[26]Progressively Optimized Local Radiance Fields for Robust View Synthesis
\n[paper](https://arxiv.org/abs/2303.13791)

\n\n[25]ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field
\n[paper](https://arxiv.org/abs/2303.13817)

\n\n[24]HandNeRF: Neural Radiance Fields for Animatable Interacting Hands
\n[paper](https://arxiv.org/abs/2303.13825)

\n\n[23]Grid-guided Neural Radiance Fields for Large Urban Scenes
\n[paper](https://arxiv.org/abs/2303.14001)

\n\n[22]EventNeRF: Neural Radiance Fields from a Single Colour Event Camera
\n[paper](https://arxiv.org/abs/2206.11896)

\n\n[21]SPARF: Neural Radiance Fields from Sparse and Noisy Poses
\n[paper](https://arxiv.org/abs/2211.11738)

\n\n[20]RUST: Latent Neural Scene Representations from Unposed Imagery
\n[paper](https://arxiv.org/abs/2211.14306)

\n\n[19]SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field
\n[paper](https://arxiv.org/abs/2303.13277)

\n\n[18]ShadowNeuS: Neural SDF Reconstruction by Shadow Ray Supervision
\n[paper](https://arxiv.org/abs/2211.14086) | [code](https://github.com/gerwang/shadowneus)

\n\n[17]Balanced Spherical Grid for Egocentric View Synthesis
\n[paper](https://arxiv.org/abs/2303.12408) | [code](https://github.com/changwoonchoi/EgoNeRF)

\n\n[16]Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention
\n[paper](https://arxiv.org/abs/2303.13014)

\n\n[15]MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures
\n[paper](https://arxiv.org/abs/2208.00277) | [code](https://github.com/google-research/jax3d)

\n\n[14]Robust Dynamic Radiance Fields
\n[paper](https://arxiv.org/abs/2301.02239)

\n\n[13]I2-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs
\n[paper](https://arxiv.org/abs/2303.07634)

\n\n[12]Learning Detailed Radiance Manifolds for High-Fidelity and 3D-Consistent Portrait Synthesis from Monocular Image
\n[paper](https://arxiv.org/abs/2211.13901)

\n\n[11]Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
\n[paper](https://arxiv.org/abs/2303.03361)

\n\n[10]Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields
\n[paper](https://arxiv.org/abs/2211.11505)

\n\n[9]DP-NeRF: Deblurred Neural Radiance Field with Physical Scene Priors
\n[paper](https://arxiv.org/abs/2211.12046) | [code](https://github.com/dogyoonlee/dp-nerf)

\n\n[8]SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields
\n[paper](https://arxiv.org/abs/2211.12254)

\n\n[7]3D Video Loops from Asynchronous Input
\n[paper](https://arxiv.org/abs/2303.05312) | [code](https://github.com/limacv/VideoLoop3D)

\n\n[6]NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer
\n[paper](https://arxiv.org/abs/2303.06919) | [code](https://t.co/uNiTd9ujCv)

\n\n[5]NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation
\n[paper](https://arxiv.org/abs/2212.14710)

\n\n[4]Renderable Neural Radiance Map for Visual Navigation
\n[paper](https://arxiv.org/abs/2303.00304)

\n\n[3]Real-Time Neural Light Field on Mobile Devices
\n[paper](https://arxiv.org/abs/2212.08057) | [project](https://snap-research.github.io/MobileR2L/)

\n\n[2]Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
\n[paper](https://arxiv.org/abs/2211.07600) | [code](https://github.com/eladrich/latent-nerf)

\n\n[1]NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior
\n[paper](https://arxiv.org/abs/2212.07388) | [project](https://nope-nerf.active.vision/)

\n\n
\n\n \n\n## Model Compression\n\n[1]Neural Video Compression with Diverse Contexts
\n[paper](http://arxiv.org/abs/2302.14402) | [code](https://github.com/microsoft/dcvc)

\n\n
\n\n \n\n### Knowledge Distillation\n\n[7]Supervised Masked Knowledge Distillation for Few-Shot Transformers
\n[paper](https://arxiv.org/abs/2303.15466) | [code](https://github.com/hl-hanlin/smkd)

\n\n[6]DisWOT: Student Architecture Search for Distillation WithOut Training
\n[paper](https://arxiv.org/abs/2303.15678)

\n\n[5]KD-DLGAN: Data Limited Image Generation via Knowledge Distillation
\n[paper](https://arxiv.org/abs/2303.17158)

\n\n[4]Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
\n[paper](https://arxiv.org/abs/2303.14666)

\n\n[3]Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation
\n[paper](https://arxiv.org/abs/2302.14290)

\n\n[2]Generic-to-Specific Distillation of Masked Autoencoders
\n[paper](https://arxiv.org/abs/2302.14771) | [code](https://github.com/pengzhiliang/G2SD)

\n\n[1]CLIPPING: Distilling CLIP-based Models for Video-Language Understanding
\n[paper](https://openreview.net/forum?id=aqIvCsRsYt)

\n\n
\n\n \n\n### Pruning\n\n[2]CP3: Channel Pruning Plug-in for Point-based Networks
\n[paper](https://arxiv.org/abs/2303.13097)

\n\n[1]DepGraph: Towards Any Structural Pruning
\n[paper](https://arxiv.org/abs/2301.12900) | [code](https://github.com/VainF/Torch-Pruning)

\n\n
\n\n \n\n### Quantization\n\n[4]Hard Sample Matters a Lot in Zero-Shot Quantization
\n[paper](https://arxiv.org/abs/2303.13826)

\n\n[3]Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
\n[paper](https://arxiv.org/abs/2303.11906)

\n\n[2]Post-training Quantization on Diffusion Models
\n[paper](https://arxiv.org/abs/2211.15736) | [code](https://github.com/42shawn/ptq4dm)

\n\n[1]Adaptive Data-Free Quantization
\n[paper](https://arxiv.org/abs/2303.06869) | [code](https://github.com/hfutqian/AdaDFQ)

\n\n
\n\n \n\n## Neural Network Structure Design\n\n[9]SMPConv: Self-moving Point Representations for Continuous Convolution
\n[paper](https://arxiv.org/abs/2304.02330) | [code](https://github.com/sangnekim/smpconv)

\n\n[8]Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection
\n[paper](https://arxiv.org/abs/2303.14404) | [code](https://github.com/akhtarvision/bpc_calibration)

\n\n[7]Compacting Binary Neural Networks by Sparse Kernel Selection
\n[paper](https://arxiv.org/abs/2303.14470)

\n\n[6]LINe: Out-of-Distribution Detection by Leveraging Important Neurons
\n[paper](https://arxiv.org/abs/2303.13995)

\n\n[5]Towards Scalable Neural Representation for Diverse Videos
\n[paper](https://arxiv.org/abs/2303.14124)

\n\n[4]Boundary Unlearning
\n[paper](https://arxiv.org/abs/2303.11570)

\n\n[3]Equiangular Basis Vectors
\n[paper](https://arxiv.org/abs/2303.11637) | [code](https://github.com/njust-vipgroup/equiangular-basis-vectors)

\n\n[2]LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
\n[paper](https://arxiv.org/abs/2206.10555) | [code](https://github.com/dvlab-research/largekernel3d)

\n\n[1]Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks
\n[paper](https://arxiv.org/abs/2303.03667) | [code](https://github.com/JierunChen/FasterNet)

\n\n
\n\n \n\n### CNN\n\n[6]VNE: An Effective Method for Improving Deep Representation by Manipulating Eigenvalue Distribution
\n[paper](https://arxiv.org/abs/2304.01434) | [code](https://github.com/jaeill/CVPR23-VNE)

\n\n[5]Randomized Adversarial Training via Taylor Expansion
\n[paper](https://arxiv.org/abs/2303.10653) | [code](https://github.com/alexkael/randomized-adversarial-training)

\n\n[4]Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations
\n[paper](https://arxiv.org/abs/2303.08085) | [code](https://github.com/hmichaeli/alias_free_convnets)

\n\n[3]DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network
\n[paper](https://arxiv.org/abs/2303.02165) | [code](https://github.com/alibaba/lightweight-neural-architecture-search)

\n\n[2]Demystify Transformers & Convolutions in Modern Image Deep Networks
\n[paper](https://arxiv.org/abs/2211.05781) | [code](https://github.com/OpenGVLab/STM-Evaluation)

\n\n[1]InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
\n[paper](https://arxiv.org/abs/2211.05778) | [code](https://github.com/OpenGVLab/InternImage)

\n\n\n
\n\n \n\n### Transformer\n\n[24]Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention
\n[paper](https://arxiv.org/abs/2304.04237) | [code](https://github.com/leaplabthu/slide-transformer)

\n\n[23]METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens
\n[paper](https://arxiv.org/abs/2304.02211)

\n\n[22]MethaneMapper: Spectral Absorption aware Hyperspectral Transformer for Methane Detection
\n[paper](https://arxiv.org/abs/2304.02767)

\n\n[21]Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
\n[paper](https://arxiv.org/abs/2304.03282) | [code](https://github.com/dingmyu/dependencyvit)

\n\n[20]Learning Expressive Prompting With Residuals for Vision Transformers
\n[paper](https://arxiv.org/abs/2303.15591)

\n\n[19]Transferable Adversarial Attacks on Vision Transformers with Token Gradient Regularization
\n[paper](https://arxiv.org/abs/2303.15754)

\n\n[18]One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer
\n[paper](https://arxiv.org/abs/2303.16160) | [code](https://github.com/IDEA-Research/OSX)

\n\n[17]Generalized Relation Modeling for Transformer Tracking
\n[paper](https://arxiv.org/abs/2303.16580) | [code](https://github.com/little-podi/grm)

\n\n[16]Learning Anchor Transformations for 3D Garment Animation
\n[paper](https://arxiv.org/abs/2304.00761)

\n\n[15]CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection
\n[paper](https://arxiv.org/abs/2301.01970) | [code](https://github.com/xiaomabufei/CAT)

\n\n[14]Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
\n[paper](https://arxiv.org/abs/2303.13755)

\n\n[13]POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery
\n[paper](https://arxiv.org/abs/2303.13357)

\n\n[12]FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER
\n[paper](https://arxiv.org/abs/2205.15448)

\n\n[11]Spherical Transformer for LiDAR-based 3D Recognition
\n[paper](https://arxiv.org/abs/2303.12766) | [code](https://github.com/dvlab-research/sphereformer)

\n\n[10]MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models
\n[paper]() | [code](https://github.com/mlvlab/MELTR)

\n\n[9]Top-Down Visual Attention from Analysis by Synthesis
\n[paper](https://arxiv.org/abs/2303.13043)

\n\n[8]BiFormer: Vision Transformer with Bi-Level Routing Attention
\n[paper](https://arxiv.org/abs/2303.08810) | [code](https://github.com/rayleizhu/biformer)

\n\n[7]Making Vision Transformers Efficient from A Token Sparsification View
\n[paper](https://arxiv.org/abs/2303.08685)

\n\n[6]Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves
\n[paper](https://arxiv.org/abs/2303.01112)

\n\n[5]Learning Imbalanced Data with Vision Transformers
\n[paper](https://arxiv.org/abs/2212.02015) | [code](https://github.com/xuzhengzhuo/livt)

\n\n[4]SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency
\n[paper](https://arxiv.org/abs/2211.02006)

\n\n[3]Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers
\n[paper](https://arxiv.org/abs/2205.12551) | [code](https://github.com/yhlleo/mjp)

\n\n[2]Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR
\n[paper](https://arxiv.org/abs/2303.07335) | [code](https://github.com/IDEA-Research/Lite-DETR)

\n\n[1]Integrally Pre-Trained Transformer Pyramid Networks
\n[paper](https://arxiv.org/abs/2211.12735) | [code](https://github.com/sunsmarterjie/iTPN)

\n\n
\n\n \n\n### Graph Neural Networks (GNN)\n\n[4]Adversarially Robust Neural Architecture Search for Graph Neural Networks
\n[paper](https://arxiv.org/abs/2304.04168)

\n\n[3]Mind the Label Shift of Augmentation-based Graph OOD Generalization
\n[paper](https://arxiv.org/abs/2303.14859)

\n\n[2]Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks
\n[paper](https://arxiv.org/abs/2303.06199)

\n\n[1]From Node Interaction to Hop Interaction: New Effective and Scalable Graph Learning Paradigm
\n[paper](https://arxiv.org/abs/2211.11761)

\n\n\n
\n\n \n\n### Neural Architecture Search (NAS)\n\n[3]Polynomial Implicit Neural Representations For Large Diverse Datasets
\n[paper](https://arxiv.org/abs/2303.11424) | [code](https://github.com/rajhans0/poly_inr)

\n\n[2]PA&DA: Jointly Sampling PAth and DAta for Consistent NAS
\n[paper](https://arxiv.org/abs/2302.14772) | [code](https://github.com/ShunLu91/PA-DA)

\n\n[1]Stitchable Neural Networks
\n[paper](https://arxiv.org/abs/2302.06586) | [code](https://github.com/ziplab/SN-Net)

\n\n
\n\n \n\n### MLP\n\n[1]ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization
\n[paper](https://arxiv.org/abs/2303.11674) | [code](https://github.com/lingeringlight/aloft)

\n\n
\n\n \n\n### MAE\n\n[1]Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders\n[paper](https://arxiv.org/abs/2212.06785) | [code](https://github.com/ZrrSkywalker/I2P-MAE)

\n\n\n\n
\n\n \n\n## Data Processing\n\n[1]TINC: Tree-structured Implicit Neural Compression
\n[paper](https://arxiv.org/abs/2211.06689) | [code](https://github.com/richealyoung/tinc)

\n\n
\n\n \n\n### Data Augmentation\n\n\n\n\n\n
\n\n \n\n### Normalization / Regularization (Batch Normalization)\n\n[2]Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling
\n[paper](https://arxiv.org/abs/2304.03937)

\n\n[1]Masked Images Are Counterfactual Samples for Robust Fine-tuning
\n[paper](https://arxiv.org/abs/2303.03052)

\n\n
\n\n \n\n### Image Clustering\n\n[2]DivClust: Controlling Diversity in Deep Clustering
\n[paper](https://arxiv.org/abs/2304.01042)

\n\n[1]On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering
\n[paper](https://arxiv.org/abs/2303.09877) | [code](https://github.com/danieltrosten/deepmvc)

\n\n
\n\n\n \n\n### Image Compression\n\n[2]Learned Image Compression with Mixed Transformer-CNN Architectures
\n[paper](https://arxiv.org/abs/2303.14978) | [code](https://github.com/jmliu206/lic_tcm)

\n\n[1]Context-Based Trit-Plane Coding for Progressive Image Compression
\n[paper](https://arxiv.org/abs/2303.05715) | [code](https://github.com/seungminjeon-github/CTC)

\n\n
\n\n \n\n## Model Training/Generalization\n\n[25]Improved Test-Time Adaptation for Domain Generalization
\n[paper](https://arxiv.org/abs/2304.04494)

\n\n[24]Re-thinking Model Inversion Attacks Against Deep Neural Networks
\n[paper](https://arxiv.org/abs/2304.01669)

\n\n[23]Regularize implicit neural representation by itself
\n[paper](https://arxiv.org/abs/2303.15484)

\n\n[22]Improving the Transferability of Adversarial Samples by Path-Augmented Method
\n[paper](https://arxiv.org/abs/2303.15735)

\n\n[21]Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency
\n[paper](https://arxiv.org/abs/2303.18191) | [code](https://github.com/cgcl-codes/teco)

\n\n[20]Progressive Random Convolutions for Single Domain Generalization
\n[paper](https://arxiv.org/abs/2304.00424)

\n\n[19]Tunable Convolutions with Parametric Multi-Loss Optimization
\n[paper](https://arxiv.org/abs/2304.00898)

\n\n[18]Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm
\n[paper](https://arxiv.org/abs/2303.14382) | [code](https://github.com/yichen928/activeft)

\n\n[17]CFA: Class-wise Calibrated Fair Adversarial Training
\n[paper](https://arxiv.org/abs/2303.14460) | [code](https://github.com/pku-ml/cfa)

\n\n[16]Generalist: Decoupling Natural and Robust Generalization
\n[paper](https://arxiv.org/abs/2303.13813)

\n\n[15]Feature Separation and Recalibration for Adversarial Robustness
\n[paper](https://arxiv.org/abs/2303.13846)

\n\n[14]Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck
\n[paper](https://arxiv.org/abs/2303.14096)

\n\n[13]FlexiViT: One Model for All Patch Sizes
\n[paper](https://arxiv.org/abs/2212.08013) | [code](https://github.com/google-research/big_vision)

\n\n[12]Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization
\n[paper](https://arxiv.org/abs/2303.13087) | [code](https://github.com/zhuohuangai/sharpdro)

\n\n[11]Improving Generalization with Domain Convex Game
\n[paper](https://arxiv.org/abs/2303.13297)

\n\n[10]TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization
\n[paper](https://arxiv.org/abs/2303.11135) | [code](https://github.com/ziquanliu/cvpr2023-twins)

\n\n[9]An Extended Study of Human-like Behavior under Adversarial Training
\n[paper](https://arxiv.org/abs/2303.12669)

\n\n[8]Sharpness-Aware Gradient Matching for Domain Generalization
\n[paper](https://arxiv.org/abs/2303.10353) | [code](https://github.com/wang-pengfei/sagm)

\n\n[7]HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining
\n[paper](https://arxiv.org/abs/2303.05675)

\n\n[6]Universal Instance Perception as Object Discovery and Retrieval
\n[paper](https://arxiv.org/abs/2303.06674) | [code](https://github.com/MasterBin-IIAU/UNINEXT)

\n\n[5]Practical Network Acceleration with Tiny Sets
\n[paper](https://arxiv.org/abs/2202.07861) | [code](https://github.com/doctorkey/practise)

\n\n[4]Towards Bridging the Performance Gaps of Joint Energy-based Models
\n[paper](https://arxiv.org/abs/2209.07959v2) | [code](https://github.com/sndnyang/sadajem)

\n\n[3]DropKey
\n[paper](https://arxiv.org/abs/2208.02646)

\n\n[2]Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
\n[paper](https://arxiv.org/abs/2303.03108)

\n\n[1]DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
\n[paper](https://arxiv.org/abs/2302.14685)

\n\n
\n\n \n\n### Noisy Label\n\n[2]Fine-Grained Classification with Noisy Labels
\n[paper](https://arxiv.org/abs/2303.02404)

\n\n[1]Combating noisy labels in object detection datasets
\n[paper](https://arxiv.org/abs/2211.13993)

\n\n\n\n
\n\n \n\n### Long-Tailed Distribution\n\n[3]Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation
\n[paper](https://arxiv.org/abs/2304.01279) | [code](https://github.com/jinyan-06/shike)

\n\n[2]SuperDisco: Super-Class Discovery Improves Visual Recognition for the Long-Tail
\n[paper](https://arxiv.org/abs/2304.00101)

\n\n[1]Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification
\n[paper](https://arxiv.org/abs/2303.12307)

\n\n\n
\n\n \n\n\n## Image feature extraction and matching\n\n[7]CCuantuMM: Cycle-Consistent Quantum-Hybrid Matching of Multiple Shapes
\n[paper](https://arxiv.org/abs/2303.16202)

\n\n[6]Adaptive Spot-Guided Transformer for Consistent Local Feature Matching
\n[paper](https://arxiv.org/abs/2303.16624)

\n\n[5]PMatch: Paired Masked Image Modeling for Dense Geometric Matching
\n[paper](https://arxiv.org/abs/2303.17342)

\n\n[4]Enhancing Deformable Local Features by Jointly Learning to Detect and Describe Keypoints
\n[paper](https://arxiv.org/abs/2304.00583)

\n\n[3]Referring Image Matting
\n[paper](https://arxiv.org/abs/2206.05149) | [code](https://github.com/jizhizili/rim)

\n\n[2]Iterative Geometry Encoding Volume for Stereo Matching
\n[paper](https://arxiv.org/abs/2303.06615) | [code](https://github.com/gangweix/igev)

\n\n[1]Modality-Agnostic Debiasing for Single Domain Generalization
\n[paper](https://arxiv.org/abs/2303.07123)

\n\n
\n\n \n\n## Visual Representation Learning\n\n[17]HNeRV: A Hybrid Neural Representation for Videos
\n[paper](https://arxiv.org/abs/2304.02633) | [code](https://github.com/haochen-rye/hnerv)

\n\n[16]Learning Rotation-Equivariant Features for Visual Correspondence
\n[paper](https://arxiv.org/abs/2303.15472)

\n\n[15]Mixed Autoencoder for Self-supervised Visual Representation Learning
\n[paper](https://arxiv.org/abs/2303.17152)

\n\n[14]Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations
\n[paper](https://arxiv.org/abs/2303.17839)

\n\n[13]Multi-Modal Representation Learning with Text-Driven Soft Masks
\n[paper](https://arxiv.org/abs/2304.00719)

\n\n[12]Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
\n[paper](https://arxiv.org/abs/2303.14191)

\n\n[11]CrOC: Cross-View Online Clustering for Dense Visual Representation Learning
\n[paper](https://arxiv.org/abs/2303.13245) | [code](https://github.com/stegmuel/croc)

\n\n[10]Masked Motion Encoding for Self-Supervised Video Representation Learning
\n[paper](https://arxiv.org/abs/2210.06096) | [code](https://github.com/xinyusun/mme)

\n\n[9]Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos
\n[paper](https://arxiv.org/abs/2303.12370) | [code](https://github.com/svip-lab/weaksvr)

\n\n[8]MARLIN: Masked Autoencoder for facial video Representation LearnINg
\n[paper](https://arxiv.org/abs/2211.06627) | [code](https://github.com/ControlNet/MARLIN)

\n\n[7]Hierarchical discriminative learning improves visual representations of biomedical microscopy
\n[paper](https://arxiv.org/abs/2303.01605)

\n\n[6]Fine-tuned CLIP Models are Efficient Video Learners
\n[paper](https://arxiv.org/abs/2212.03640) | [code](https://github.com/muzairkhattak/vifi-clip)

\n\n[5]Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning
\n[paper](https://arxiv.org/abs/2212.04500) | [code](https://github.com/ruiwang2021/mvd)

\n\n[4]Open-Set Representation Learning through Combinatorial Embedding
\n[paper](https://arxiv.org/abs/2106.15278)

\n\n[3]NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction
\n[paper](https://arxiv.org/abs/2211.08024)

\n\n[2]Stare at What You See: Masked Image Modeling without Reconstruction
\n[paper](https://arxiv.org/abs/2211.08887) | [code](https://github.com/openperceptionx/maskalign)

\n\n[1]Switchable Representation Learning Framework with Self-compatibility
\n[paper](https://arxiv.org/abs/2206.08289)

\n\n
\n\n \n\n## Model Evaluation\n\n[3]ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing
\n[paper](https://arxiv.org/abs/2303.17096) | [code](https://github.com/alibaba/easyrobust/tree/main/benchmarks/imagenet-e)

\n\n[2]Physically Adversarial Infrared Patches with Learnable Shapes and Locations
\n[paper](https://arxiv.org/abs/2303.13868)

\n\n[1]TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
\n[paper](https://arxiv.org/abs/2303.05762) | [code](https://github.com/chenweixin107/trojdiff)

\n\n
\n\n \n\n## Multi-Modal Learning\n\n[18]Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
\n[paper](https://arxiv.org/abs/2304.03307) | [code](https://github.com/talalwasim/vita-clip)

\n\n[17]Detecting and Grounding Multi-Modal Media Manipulation
\n[paper](https://arxiv.org/abs/2304.02556) | [code](https://github.com/rshaojimmy/multimodal-deepfake)

\n\n[16]Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce
\n[paper](https://arxiv.org/abs/2304.02853)

\n\n[15]Quantum Multi-Model Fitting
\n[paper](https://arxiv.org/abs/2303.15444) | [code](https://github.com/farinamatteo/qmmf)

\n\n[14]Towards Flexible Multi-modal Document Models
\n[paper](https://arxiv.org/abs/2303.18248)

\n\n[13]CLIP2Scene: Towards Label-Efficient 3D Scene Understanding by CLIP
\n[paper](https://arxiv.org/abs/2301.04926) | [code](https://github.com/runnanchen/CLIP2Scene)

\n\n[12]MaPLe: Multi-modal Prompt Learning
\n[paper](https://arxiv.org/abs/2210.03117) | [code](https://github.com/muzairkhattak/multimodal-prompt-learning)

\n\n[11]Decoupled Multimodal Distilling for Emotion Recognition
\n[paper](https://arxiv.org/abs/2303.13802)

\n\n[10]MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
\n[paper](https://arxiv.org/abs/2212.09478) | [code](https://github.com/researchmm/mm-diffusion)

\n\n[9]BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency
\n[paper](https://arxiv.org/abs/2303.12419) | [code](https://github.com/xu5zhao/bicro)

\n\n[8]Multimodal Feature Extraction and Attention-based Fusion for Emotion Estimation in Videos
\n[paper](https://arxiv.org/abs/2303.10421) | [code](https://github.com/xkwangcn/abaw-5th-rt-iai)

\n\n[7]Emotional Reaction Intensity Estimation Based on Multimodal Data
\n[paper](https://arxiv.org/abs/2303.09167)

\n\n[6]Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers
\n[paper](https://arxiv.org/abs/2303.09164)

\n\n[5]Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning
\n[paper](https://arxiv.org/abs/2303.05952)

\n\n[4]Multimodal Prompting with Missing Modalities for Visual Recognition
\n[paper](https://arxiv.org/abs/2303.03369) | [code](https://github.com/yilunlee/missing_aware_prompts)

\n\n[3]Align and Attend: Multimodal Summarization with Dual Contrastive Losses
\n[paper](https://arxiv.org/abs/2303.07284) | [code](https://boheumd.github.io/A2Summ/)

\n\n[2]Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
\n[paper](https://arxiv.org/abs/2211.09807) | [code](https://github.com/OpenGVLab/M3I-Pretraining)

\n\n[1]Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
\n[paper](https://arxiv.org/abs/2211.09808) | [code](https://github.com/fundamentalvision/Uni-Perceiver)

\n\n\n
\n\n \n\n### Audio-visual Learning\n\n[11]Fine-grained Audible Video Description
\n[paper](https://arxiv.org/abs/2303.15616)

\n\n[10]Language-Guided Audio-Visual Source Separation via Trimodal Consistency
\n[paper](https://arxiv.org/abs/2303.16342)

\n\n[9]Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos
\n[paper](https://arxiv.org/abs/2303.16897)

\n\n[8]Audio-Visual Grouping Network for Sound Localization from Mixtures
\n[paper](https://arxiv.org/abs/2303.17056) | [code](https://github.com/stonemo/avgn)

\n\n[7]Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
\n[paper](https://arxiv.org/abs/2303.17490)

\n\n[6]Egocentric Audio-Visual Object Localization
\n[paper](https://arxiv.org/abs/2303.13471) | [code](https://github.com/wikichao/ego-av-loc)

\n\n[5]Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning
\n[paper](https://arxiv.org/abs/2303.11302)

\n\n[4]Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
\n[paper](https://arxiv.org/abs/2303.12930)

\n\n[3]Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring
\n[paper](https://arxiv.org/abs/2303.08536) | [code](https://github.com/joannahong/av-relscore)

\n\n[2]CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective
\n[paper](https://arxiv.org/abs/2303.06357)

\n\n[1]A Light Weight Model for Active Speaker Detection
\n[paper](https://arxiv.org/abs/2303.04439) | [code](https://github.com/junhua-liao/light-asd)

\n\n\n
\n\n \n\n### Vision-language\n\n[30]CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model
\n[paper](https://arxiv.org/abs/2304.04231) | [code](https://github.com/dk-liang/crowdclip)

\n\n[29]Improving Vision-and-Language Navigation by Generating Future-View Image Semantics
\n[paper](https://arxiv.org/abs/2304.04907)

\n\n[28]Learning to Name Classes for Vision and Language Models
\n[paper](https://arxiv.org/abs/2304.01830)

\n\n[27]VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision
\n[paper](https://arxiv.org/abs/2304.03135) | [code](https://github.com/lmy98129/vlpd)

\n\n[26]HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
\n[paper](https://arxiv.org/abs/2303.15786) | [code](https://github.com/artanic30/hoiclip)

\n\n[25]KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation
\n[paper](https://arxiv.org/abs/2303.15796) | [code](https://github.com/xiangyangli20/kerm)

\n\n[24]PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout
\n[paper](https://arxiv.org/abs/2303.15937) | [code](https://github.com/pku-icst-mipl/posterlayout-cvpr2023)

\n\n[23]SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
\n[paper](https://arxiv.org/abs/2303.17200)

\n\n[22]VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining
\n[paper](https://arxiv.org/abs/2303.14302)

\n\n[21]Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
\n[paper](https://arxiv.org/abs/2303.14369) | [code](https://github.com/jpthu17/HBI)

\n\n[20]IFSeg: Image-free Semantic Segmentation via Vision-Language Model
\n[paper](https://arxiv.org/abs/2303.14396) | [code](https://github.com/alinlab/ifseg)

\n\n[19]Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
\n[paper](https://arxiv.org/abs/2303.14968) | [code](https://github.com/zwx8981/liqe)

\n\n[18]MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model
\n[paper](https://arxiv.org/abs/2210.05335) | [code](https://github.com/iigroup/map)

\n\n[17]Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
\n[paper](https://arxiv.org/abs/2211.13437)

\n\n[16]Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
\n[paper](https://arxiv.org/abs/2301.00182) | [code](https://github.com/whwu95/BIKE)

\n\n[15]Test of Time: Instilling Video-Language Models with a Sense of Time
\n[paper](https://arxiv.org/abs/2301.02074) | [code](https://github.com/bpiyush/TestOfTime)

\n\n[14]Accelerating Vision-Language Pretraining with Free Language Modeling
\n[paper](https://arxiv.org/abs/2303.14038)

\n\n[13]Task Residual for Tuning Vision-Language Models
\n[paper](https://arxiv.org/abs/2211.10277) | [code](https://github.com/geekyutao/taskres)

\n\n[12]MAGVLT: Masked Generative Vision-and-Language Transformer
\n[paper](https://arxiv.org/abs/2303.12208)

\n\n[11]Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding
\n[paper](https://arxiv.org/abs/2303.12513) | [code](https://github.com/TAU-VAILab/isbertblind)

\n\n[10]Lana: A Language-Capable Navigator for Instruction Following and Generation
\n[paper](https://arxiv.org/abs/2303.08409) | [code](https://github.com/wxh1996/lana-vln)

\n\n[9]FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks
\n[paper](https://arxiv.org/abs/2303.02483) | [code](https://github.com/brandonhanx/fame-vil)

\n\n[8]Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
\n[paper](https://arxiv.org/abs/2303.04077)

\n\n[7]Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
\n[paper](https://arxiv.org/abs/2301.04558)

\n\n[6]Connecting Vision and Language with Video Localized Narratives
\n[paper](https://arxiv.org/abs/2302.11217) | [code](https://github.com/google/video-localized-narratives)

\n\n[5]Policy Adaptation from Foundation Model Feedback
\n[paper](https://arxiv.org/abs/2212.07398)

\n\n[4]Open-vocabulary Attribute Detection
\n[paper](https://arxiv.org/abs/2211.12914)

\n\n[3]Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
\n[paper](https://arxiv.org/abs/2303.00040)

\n\n[2]Turning a CLIP Model into a Scene Text Detector
\n[paper](https://arxiv.org/abs/2302.14338) | [code](https://github.com/wenwenyu/TCM)

\n\n[1]GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods
\n[paper](https://arxiv.org/abs/2301.01893)

\n\n\n
\n \n\n## Vision-based Prediction\n\n[4]TBP-Former: Learning Temporal Bird's-Eye-View Pyramid for Joint Perception and Prediction in Vision-Centric Autonomous Driving
\n[paper](https://arxiv.org/abs/2303.09998)

\n\n[3]Intention-Conditioned Long-Term Human Egocentric Action Forecasting
\n[paper](https://arxiv.org/abs/2207.12080) | [code](https://github.com/Evm7/ego4dlta-icvae)

\n\n[2]Computational Choreography using Human Motion Synthesis
\n[paper](https://arxiv.org/abs/2210.04366)

\n\n[1]IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint Multi-Agent Trajectory Prediction
\n[paper](https://arxiv.org/abs/2303.00575)

\n\n\n\n
\n \n\n## Dataset\n\n[21]Uncurated Image-Text Datasets: Shedding Light on Demographic Bias
\n[paper](https://arxiv.org/abs/2304.02828) | [code](https://github.com/noagarcia/phase)

\n\n[20]CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions
\n[paper](https://arxiv.org/abs/2303.17948)

\n\n[19]CelebV-Text: A Large-Scale Facial Text-Video Dataset
\n[paper](https://arxiv.org/abs/2303.14717) | [code](https://github.com/CelebV-Text/CelebV-Text)

\n\n[18]On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks
\n[paper](https://arxiv.org/abs/2303.14840) | [code](https://github.com/junggy/hammer-dataset)

\n\n[17]Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method
\n[paper](https://arxiv.org/abs/2303.15166) | [code](https://github.com/dreemurr-t/baid)

\n\n[16]Recovering 3D Hand Mesh Sequence from a Single Blurry Image: A New Dataset and Temporal Unfolding
\n[paper](https://arxiv.org/abs/2303.15417) | [code](https://github.com/jaehakim97/blurhand_release)

\n\n[15]GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts
\n[paper](https://arxiv.org/abs/2211.05272)

\n\n[14]ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data
\n[paper](https://arxiv.org/abs/2303.13885)

\n\n[13]Fantastic Breaks: A Dataset of Paired 3D Scans of Real-World Broken Objects and Their Complete Counterparts
\n[paper](https://arxiv.org/abs/2303.14152)

\n\n[12]A Bag-of-Prototypes Representation for Dataset-Level Applications
\n[paper](https://arxiv.org/abs/2303.13251)

\n\n[11]Music-Driven Group Choreography
\n[paper](https://arxiv.org/abs/2303.12337)

\n\n[10]RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset
\n[paper](https://arxiv.org/abs/2303.12564)

\n\n[9]Backdoor Defense via Adaptively Splitting Poisoned Dataset
\n[paper](https://arxiv.org/abs/2303.12993) | [code](https://github.com/kuofenggao/asd)

\n\n[8]Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models
\n[paper](https://arxiv.org/abs/2303.13031) | [code](https://github.com/andreguo/hdrtvdm)

\n\n[7]SLOPER4D: A Scene-Aware Dataset for Global 4D Human Pose Estimation in Urban Environments
\n[paper](https://arxiv.org/abs/2303.09095) | [code](https://github.com/climbingdaily/SLOPER4D)

\n\n[6]A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
\n[paper](https://arxiv.org/abs/2212.04825) | [code](https://github.com/facebookresearch/Whac-A-Mole)

\n\n[5]MVImgNet: A Large-scale Dataset of Multi-view Images
\n[paper](https://arxiv.org/abs/2303.06042)

\n\n[4]Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo
\n[paper](https://arxiv.org/abs/2303.01943)

\n\n[3]CUDA: Convolution-based Unlearnable Datasets
\n[paper](https://arxiv.org/abs/2303.04278)

\n\n[2]V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception
\n[paper](https://arxiv.org/abs/2303.07601)

\n\n[1]Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes
\n[paper](https://arxiv.org/abs/2303.02760)

\n\n\n\n
\n\n \n\n## Active Learning\n\n\n\n
\n\n \n\n## Few-shot Learning/Zero-shot Learning\n\n[15]Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
\n[paper](https://arxiv.org/abs/2304.03119)

\n\n[14]Zero-shot Model Diagnosis
\n[paper](https://arxiv.org/abs/2303.15441)

\n\n[13]AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
\n[paper](https://arxiv.org/abs/2303.16501)

\n\n[12]Hierarchical Dense Correlation Distillation for Few-Shot Segmentation
\n[paper](https://arxiv.org/abs/2303.14652)

\n\n[11]ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection
\n[paper](https://arxiv.org/abs/2303.14679) | [code](https://github.com/casia-iva-lab/zbs)

\n\n[10]Learning Attention as Disentangler for Compositional Zero-shot Learning
\n[paper](https://arxiv.org/abs/2303.15111) | [code](https://github.com/haoosz/ade-czsl)

\n\n[9]Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
\n[paper](https://arxiv.org/abs/2303.15322) | [code](https://github.com/manliucoder/psvma)

\n\n[8]CF-Font: Content Fusion for Few-shot Font Generation
\n[paper](https://arxiv.org/abs/2303.14017)

\n\n[7]DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection
\n[paper](https://arxiv.org/abs/2303.09674) | [code](https://github.com/phoenix-v/digeo)

\n\n[6]Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-shot Learning with Hyperspherical Embeddings
\n[paper](https://arxiv.org/abs/2303.09352) | [code](https://github.com/uitml/nohub)

\n\n[5]Bi-directional Distribution Alignment for Transductive Zero-Shot Learning
\n[paper](https://arxiv.org/abs/2303.08698) | [code](https://github.com/zhicaiwww/bi-vaegan)

\n\n[4]Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation
\n[paper](https://arxiv.org/abs/2303.01311)

\n\n[3]Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
\n[paper](https://arxiv.org/abs/2303.02151) | [code](https://github.com/zrrskywalker/cafo)

\n\n[2]NIFF: Alleviating Forgetting in Generalized Few-Shot Object Detection via Neural Instance Feature Forging
\n[paper](https://arxiv.org/abs/2303.04958)

\n\n[1]FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization
\n[paper](https://arxiv.org/abs/2303.07418) | [code](https://github.com/jiawei-yang/freenerf)

\n\n
\n\n \n\n## Continual Learning/Life-long Learning\n\n[7]Asynchronous Federated Continual Learning
\n[paper](https://arxiv.org/abs/2304.03626) | [code](https://github.com/lttm/fedspace)

\n\n[6]Exploring Data Geometry for Continual Learning
\n[paper](https://arxiv.org/abs/2304.03931)

\n\n[5]Task Difficulty Aware Parameter Allocation & Regularization for Lifelong Learning
\n[paper](https://arxiv.org/abs/2304.05288) | [code](https://github.com/wenjinw/par)

\n\n[4]Online Distillation with Continual Learning for Cyclic Domain Shifts
\n[paper](https://arxiv.org/abs/2304.01239)

\n\n[3]Preserving Linear Separability in Continual Learning by Backward Feature Projection
\n[paper](https://arxiv.org/abs/2303.14595)

\n\n[2]Computationally Budgeted Continual Learning: What Does Matter?
\n[paper](https://arxiv.org/abs/2303.11165) | [code](https://github.com/drimpossible/budgetcl)

\n\n[1]Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning
\n[paper](https://arxiv.org/abs/2303.09483) | [code](https://github.com/kim-sanghwan/ancl)

\n\n
\n\n \n\n## Scene Graph\n\n[2]Devil's on the Edges: Selective Quad Attention for Scene Graph Generation
\n[paper](https://arxiv.org/abs/2304.03495)

\n\n[1]Probabilistic Debiasing of Scene Graphs
\n[paper](https://arxiv.org/abs/2211.06444) | [code](https://github.com/bashirulazam/within-triplet-debias)

\n\n
\n\n \n\n### Scene Graph Generation\n\n[1]Prototype-based Embedding Network for Scene Graph Generation
\n[paper](https://paperswithcode.com/paper/prototype-based-embedding-network-for-scene)

\n\n
\n\n \n\n### Scene Graph Prediction\n\n[1]VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
\n[paper](https://arxiv.org/abs/2303.14408) | [code](https://github.com/wz7in/cvpr2023-vlsat)

\n\n
\n\n \n\n### Scene Graph Understanding\n\n[2]SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
\n[paper](https://arxiv.org/abs/2204.11964)

\n\n[1]PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
\n[paper](https://arxiv.org/abs/2211.16312) | [code](https://github.com/cvmi-lab/pla)

\n\n
\n\n \n\n## Visual Localization/Pose Estimation\n\n[7]OrienterNet: Visual Localization in 2D Public Maps with Neural Matching
\n[paper](https://arxiv.org/abs/2304.02009)

\n\n[6]Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention
\n[paper](https://arxiv.org/abs/2303.15274)

\n\n[5]Human Pose as Compositional Tokens
\n[paper](https://arxiv.org/abs/2303.11638)

\n\n[4]Data-efficient Large Scale Place Recognition with Graded Similarity Supervision
\n[paper](https://arxiv.org/abs/2303.11739) | [code](https://github.com/marialeyvallina/generalized_contrastive_loss)

\n\n[3]PSVT: End-to-End Multi-person 3D Pose and Shape Estimation with Progressive Video Transformers
\n[paper](https://arxiv.org/abs/2303.09187)

\n\n[2]StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition
\n[paper](https://arxiv.org/abs/2212.00937)

\n\n[1]PyramidFlow: High-Resolution Defect Contrastive Localization using Pyramid Normalizing Flow
\n[paper](https://arxiv.org/abs/2303.02595)

\n\n\n\n
\n\n \n\n## Visual Reasoning/VQA\n\n[9]Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering
\n[paper](https://arxiv.org/abs/2304.03754)

\n\n[8]MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos
\n[paper](https://arxiv.org/abs/2303.14933) | [code](https://github.com/zzc-1998/md-vqa)

\n\n[7]3D Concept Learning and Reasoning from Multi-View Images
\n[paper](https://arxiv.org/abs/2303.11327)

\n\n[6]Abstract Visual Reasoning: An Algebraic Approach for Solving Raven's Progressive Matrices
\n[paper](https://arxiv.org/abs/2303.11730) | [code](https://github.com/xu-jingyi/algebraicmr)

\n\n[5]Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning
\n[paper](https://arxiv.org/abs/2303.10482) | [code](https://github.com/szzexpoi/poem)

\n\n[4]Generative Bias for Robust Visual Question Answering
\n[paper](https://arxiv.org/abs/2208.00690)

\n\n[3]MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
\n[paper](https://arxiv.org/abs/2303.01239) | [code](https://github.com/jingjing12110/mixphm)

\n\n[2]Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering
\n[paper](https://arxiv.org/abs/2303.01903) | [code](https://github.com/milvlg/prophet)

\n\n[1]From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models
\n[paper](https://arxiv.org/abs/2212.10846) | [code](https://github.com/salesforce/lavis)

\n\n
\n\n \n\n## Image Classification\n\n[4]Rawgment: Noise-Accounted RAW Augmentation Enables Recognition in a Wide Variety of Environments
\n[paper](https://arxiv.org/abs/2210.16046)

\n\n[3]Semantic Prompt for Few-Shot Image Recognition
\n[paper](https://arxiv.org/abs/2303.14123)

\n\n[2]Boosting Verified Training for Robust Image Classifications via Abstraction
\n[paper](https://arxiv.org/abs/2303.11552) | [code](https://github.com/zhangzhaodi233/abscert)

\n\n[1]I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
\n[paper](https://arxiv.org/abs/2212.02291)

\n\n
\n\n \n\n## Transfer Learning/Domain Adaptation\n\n[17]DATE: Domain Adaptive Product Seeker for E-commerce
\n[paper](https://arxiv.org/abs/2304.03669)

\n\n[16]Modernizing Old Photos Using Multiple References via Photorealistic Style Transfer
\n[paper](https://arxiv.org/abs/2304.04461)

\n\n[15]GeoNet: Benchmarking Unsupervised Adaptation across Geographies
\n[paper](https://arxiv.org/abs/2303.15443)

\n\n[14]C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation
\n[paper](https://arxiv.org/abs/2303.17132)

\n\n[13]AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation
\n[paper](https://arxiv.org/abs/2304.01110) | [code](https://github.com/gzaraunitn/autolabel)

\n\n[12]BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
\n[paper](https://arxiv.org/abs/2303.14773) | [code](https://github.com/changdaeoh/blackvip)

\n\n[11]Deep Frequency Filtering for Domain Generalization
\n[paper](https://arxiv.org/abs/2203.12198)

\n\n[10]Semi-Supervised Domain Adaptation with Source Label Adaptation
\n[paper](https://arxiv.org/abs/2302.02335) | [code](https://github.com/chu0802/sla)

\n\n[9]Unsupervised Continual Semantic Adaptation through Neural Rendering
\n[paper](https://arxiv.org/abs/2211.13969)

\n\n[8]MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
\n[paper](https://arxiv.org/abs/2212.01322) | [code](https://github.com/lhoyer/mic)

\n\n[7]Patch-Mix Transformer for Unsupervised Domain Adaptation: A Game Perspective
\n[paper](https://arxiv.org/abs/2303.13434)

\n\n[6]Manipulating Transfer Learning for Property Inference
\n[paper](https://arxiv.org/abs/2303.11643) | [code](https://github.com/yulongt23/transfer-inference)

\n\n[5]Trainable Projected Gradient Method for Robust Fine-tuning
\n[paper](https://arxiv.org/abs/2303.10720)

\n\n[4]DA-DETR: Domain Adaptive Detection Transformer with Information Fusion
\n[paper](https://arxiv.org/abs/2103.17084)

\n\n[3]Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection
\n[paper](https://arxiv.org/abs/2203.15793) | [code](https://github.com/vibashan/irg-sfda)

\n\n[2]Guiding Pseudo-labels with Uncertainty Estimation for Source-free Unsupervised Domain Adaptation
\n[paper](https://arxiv.org/abs/2303.03770) | [code](https://github.com/mattialitrico/guiding-pseudo-labels-with-uncertainty-estimation-for-source-free-unsupervised-domain-adaptation)

\n\n[1]Adaptive Assignment for Geometry Aware Local Feature Matching
\n[paper](https://arxiv.org/abs/2207.08427)

\n\n
\n\n \n\n## Metric Learning\n\n\n\n
\n\n \n\n## Contrastive Learning\n\n[11]FEND: A Future Enhanced Distribution-Aware Contrastive Learning Framework for Long-tail Trajectory Prediction
\n[paper](https://arxiv.org/abs/2303.16574)

\n\n[10]Dynamic Conceptional Contrastive Learning for Generalized Category Discovery
\n[paper](https://arxiv.org/abs/2303.17393) | [code](https://github.com/tpcd/dccl)

\n\n[9]Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens
\n[paper](https://arxiv.org/abs/2303.14865)

\n\n[8]PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery
\n[paper](https://arxiv.org/abs/2212.05590) | [code](https://github.com/sheng-eatamath/promptcal)

\n\n[7]Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data
\n[paper](https://arxiv.org/abs/2303.14080)

\n\n[6]Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
\n[paper](https://arxiv.org/abs/2301.05709)

\n\n[5]Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
\n[paper](https://arxiv.org/abs/2303.12112) | [code](https://github.com/aimagelab/pacscore)

\n\n[4]MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset
\n[paper](https://arxiv.org/abs/2303.12756) | [code](https://github.com/MrChenFeng/MaskCon_CVPR2023)

\n\n[3]CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
\n[paper](https://arxiv.org/abs/2303.12793) | [code](https://github.com/FangyunWei/SLRT)

\n\n[2]Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation
\n[paper](https://arxiv.org/abs/2303.10323) | [code](https://github.com/mlii0117/dcl)

\n\n[1]Twin Contrastive Learning with Noisy Labels
\n[paper](https://arxiv.org/abs/2303.06930) | [code](https://github.com/Hzzone/TCL)

\n\n
\n\n \n\n## Incremental Learning\n\n[5]PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning
\n[paper](https://arxiv.org/abs/2304.04408)

\n\n[4]On the Stability-Plasticity Dilemma of Class-Incremental Learning
\n[paper](https://arxiv.org/abs/2304.01663)

\n\n[3]Learning with Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning
\n[paper](https://arxiv.org/abs/2304.00426) | [code](https://github.com/zysong0113/savc)

\n\n[2]Class-Incremental Exemplar Compression for Class-Incremental Learning
\n[paper](https://arxiv.org/abs/2303.14042)

\n\n[1]Dense Network Expansion for Class Incremental Learning
\n[paper](https://arxiv.org/abs/2303.12696)

\n\n
\n\n \n\n## Reinforcement Learning\n\n[4]Reinforcement Learning-Based Black-Box Model Inversion Attacks
\n[paper](https://arxiv.org/abs/2304.04625)

\n\n[3]Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction
\n[paper](https://arxiv.org/abs/2301.10034) | [code](https://github.com/craftjarvis/mc-controller)

\n\n[2]ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals
\n[paper](https://arxiv.org/abs/2303.12071)

\n\n[1]EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning
\n[paper](https://arxiv.org/abs/2303.10876) | [code](https://github.com/mediabrain-sjtu/eqmotion)

\n\n
\n\n \n\n## Meta Learning\n\n[4]Meta-causal Learning for Single Domain Generalization
\n[paper](https://arxiv.org/abs/2304.03709)

\n\n[3]Meta Compositional Referring Expression Segmentation
\n[paper](https://arxiv.org/abs/2304.04415)

\n\n[2]Meta-Learning with a Geometry-Adaptive Preconditioner
\n[paper](https://arxiv.org/abs/2304.01552) | [code](https://github.com/suhyun777/cvpr23-gap)

\n\n[1]A Meta-Learning Approach to Predicting Performance and Data Requirements
\n[paper](https://arxiv.org/abs/2303.01598)

\n\n
\n\n \n\n## Robotics\n\n[2]Efficient Map Sparsification Based on 2D and 3D Discretized Grids
\n[paper](https://arxiv.org/abs/2303.10882)

\n\n[1]PyPose: A Library for Robot Learning with Physics-based Optimization
\n[paper](https://arxiv.org/abs/2209.15428) | [code](https://pypose.org/)

\n\n\n
\n\n \n\n## Semi-supervised/Weakly Supervised/Unsupervised/Self-supervised Learning\n\n[29]Weakly supervised segmentation with point annotations for histopathology images via contrast-based variational model
\n[paper](https://arxiv.org/abs/2304.03572)

\n\n[28]Token Boosting for Robust Self-Supervised Visual Transformer Pre-training
\n[paper](https://arxiv.org/abs/2304.04175)

\n\n[27]SOOD: Towards Semi-Supervised Oriented Object Detection
\n[paper](https://arxiv.org/abs/2304.04515) | [code](https://github.com/hamperdredes/sood)

\n\n[26]Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning
\n[paper](https://arxiv.org/abs/2304.01482) | [code](https://github.com/ucdvision/patchsearch)

\n\n[25]Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks
\n[paper](https://arxiv.org/abs/2303.17602) | [code](https://github.com/tinyvision/SOLIDER)

\n\n[24]Siamese DETR
\n[paper](https://arxiv.org/abs/2303.18144)

\n\n[23]HaLP: Hallucinating Latent Positives for Skeleton-based Self-Supervised Learning of Actions
\n[paper](https://arxiv.org/abs/2304.00387)

\n\n[22]Detecting Backdoors in Pre-trained Encoders
\n[paper](https://arxiv.org/abs/2303.15180) | [code](https://github.com/giantseaweed/decree)

\n\n[21]Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
\n[paper](https://arxiv.org/abs/2201.07513)

\n\n[20]Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation
\n[paper](https://arxiv.org/abs/2303.01276) | [code](https://github.com/xiaoyao3302/CCVC)

\n\n[19]ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning
\n[paper](https://arxiv.org/abs/2303.13556)

\n\n[18]Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels
\n[paper](https://arxiv.org/abs/2303.13223)

\n\n[17]Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching
\n[paper](https://arxiv.org/abs/2303.10971) | [code](https://github.com/dongliangcao/Self-Supervised-Multimodal-Shape-Matching)

\n\n[16]Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data
\n[paper](https://arxiv.org/abs/2303.11066)

\n\n[15]Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning
\n[paper](https://arxiv.org/abs/2303.11101)

\n\n[14]Correlational Image Modeling for Self-Supervised Visual Pre-Training
\n[paper](https://arxiv.org/abs/2303.12670)

\n\n[13]Extracting Class Activation Maps from Non-Discriminative Features as well
\n[paper](https://arxiv.org/abs/2303.10334) | [code](https://github.com/zhaozhengchen/lpcam)

\n\n[12]TeSLA: Test-Time Self-Learning With Automatic Adversarial Augmentation
\n[paper](https://arxiv.org/abs/2303.09870) | [code](https://github.com/devavrattomar/tesla)

\n\n[11]LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding
\n[paper](https://arxiv.org/abs/2303.09665)

\n\n[10]MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection
\n[paper](https://arxiv.org/abs/2303.09061) | [code](https://github.com/lliuz/mixteacher)

\n\n[9]Semi-supervised Hand Appearance Recovery via Structure Disentanglement and Dual Adversarial Discrimination
\n[paper](https://arxiv.org/abs/2303.06380)

\n\n[8]Non-Contrastive Unsupervised Learning of Physiological Signals from Video
\n[paper](https://arxiv.org/abs/2303.07944)

\n\n[7]Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems
\n[paper](https://arxiv.org/abs/2303.01669) | [code](https://github.com/ganperf/lcr)

\n\n[6]Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models
\n[paper](https://arxiv.org/abs/2303.01869)

\n\n[5]The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
\n[paper](https://arxiv.org/abs/2205.12502) | [code](https://github.com/gicheonkang/gst-visdial)

\n\n[4]Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning
\n[paper](https://arxiv.org/abs/2303.06870) | [code](https://github.com/megvii-research/US3L-CVPR2023)

\n\n[3]Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
\n[paper](https://arxiv.org/abs/2302.14746)

\n\n[2]Siamese Image Modeling for Self-Supervised Vision Representation Learning
\n[paper](https://arxiv.org/abs/2206.01204) | [code](https://github.com/fundamentalvision/Siamese-Image-Modeling)

\n\n[1]Cut and Learn for Unsupervised Object Detection and Instance Segmentation
\n[paper](https://arxiv.org/abs/2301.11320) | [project](http://people.eecs.berkeley.edu/~xdwang/projects/CutLER/)

\n\n
\n\n \n\n## Neural Network Interpretability\n\n[6]Gradient-based Uncertainty Attribution for Explainable Bayesian Deep Learning
\n[paper](https://arxiv.org/abs/2304.04824)

\n\n[5]Are Data-driven Explanations Robust against Out-of-distribution Data?
\n[paper](https://arxiv.org/abs/2303.16390)

\n\n[4]IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients
\n[paper](https://arxiv.org/abs/2303.14242) | [code](https://github.com/yangruo1226/idgi)

\n\n[3]OCTET: Object-aware Counterfactual Explanations
\n[paper](https://arxiv.org/abs/2211.12380) | [code](https://github.com/valeoai/octet)

\n\n[2]Don't Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis
\n[paper](https://arxiv.org/abs/2202.07728)

\n\n[1]SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries
\n[paper](https://arxiv.org/abs/2302.12828) | [code](https://github.com/AhmedImtiazPrio/SplineCAM)

\n\n
\n\n \n\n\n## Image Counting\n\n[2]Density Map Distillation for Incremental Object Counting
\n[paper](https://arxiv.org/abs/2304.05255)

\n\n[1]Zero-shot Object Counting
\n[paper](https://arxiv.org/abs/2303.02001)

\n\n
\n\n \n\n\n## Federated Learning\n\n[4]The Resource Problem of Using Linear Layer Leakage Attack in Federated Learning
\n[paper](https://arxiv.org/abs/2303.14868)

\n\n[3]Make Landscape Flatter in Differentially Private Federated Learning
\n[paper](https://arxiv.org/abs/2303.11242)

\n\n[2]STDLens: Model Hijacking-resilient Federated Learning for Object Detection
\n[paper](https://arxiv.org/abs/2303.11511) | [code](https://github.com/git-disl/stdlens)

\n\n[1]Re-thinking Federated Active Learning based on Inter-class Diversity
\n[paper](https://arxiv.org/abs/2303.12317) | [code](https://github.com/raymin0223/logo)

\n\n
\n\n \n\n\n## Autonomous Driving\n\n\n[1]BEVFormer v2: Adapting Modern Image Backbones to Bird’s-Eye-View Recognition via Perspective Supervision
\n[paper](https://arxiv.org/abs/2211.10439)

\n\n
\n\n \n\n## Others\n\n[74]Bridging the Gap between Model Explanations in Partially Annotated Multi-label Classification
\n[paper](https://arxiv.org/abs/2304.01804) | [code](https://github.com/youngwk/bridgegapexplanationpamc)

\n\n[73]Knowledge Combination to Learn Rotated Detection Without Rotated Annotation
\n[paper](https://arxiv.org/abs/2304.02199)

\n\n[72]CloSET: Modeling Clothed Humans on Continuous Surface with Explicit Template Decomposition
\n[paper](https://arxiv.org/abs/2304.03167)

\n\n[71]DC2: Dual-Camera Defocus Control by Learning to Refocus
\n[paper](https://arxiv.org/abs/2304.03285)

\n\n[70]Scalable, Detailed and Mask-Free Universal Photometric Stereo
\n[paper](https://arxiv.org/abs/2303.15724) | [code](https://github.com/satoshi-ikehata/sdm-unips-cvpr2023)

\n\n[69]DiffCollage: Parallel Generation of Large Content with Diffusion Models
\n[paper](https://arxiv.org/abs/2303.17076)

\n\n[68]Why is the winner the best?
\n[paper](https://arxiv.org/abs/2303.17719)

\n\n[67]UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning
\n[paper](https://arxiv.org/abs/2304.00464)

\n\n[66]HypLiLoc: Towards Effective LiDAR Pose Regression with Hyperbolic Fusion
\n[paper](https://arxiv.org/abs/2304.00932) | [code](https://github.com/sijieaaa/hypliloc)

\n\n[65]Neural Volumetric Memory for Visual Locomotion Control
\n[paper](https://arxiv.org/abs/2304.01201)

\n\n[64]DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality
\n[paper](https://arxiv.org/abs/2303.14585) | [code](https://github.com/yizhiwang96/deepvecfont-v2)

\n\n[63]PDPP:Projected Diffusion for Procedure Planning in Instructional Videos
\n[paper](https://arxiv.org/abs/2303.14676)

\n\n[62]Disentangling Writer and Character Styles for Handwriting Generation
\n[paper](https://arxiv.org/abs/2303.14736) | [code](https://github.com/dailenson/sdt)

\n\n[61]Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation
\n[paper](https://arxiv.org/abs/2303.14926)

\n\n[60]DANI-Net: Uncalibrated Photometric Stereo by Differentiable Shadow Handling, Anisotropic Reflectance Modeling, and Neural Inverse Rendering
\n[paper](https://arxiv.org/abs/2303.15101) | [code](https://github.com/lmozart/cvpr2023-dani-net)

\n\n[59]Multi-Granularity Archaeological Dating of Chinese Bronze Dings Based on a Knowledge-Guided Relation Graph
\n[paper](https://arxiv.org/abs/2303.15266) | [code](https://github.com/zhourixin/bronze-ding)

\n\n[58]Handwritten Text Generation from Visual Archetypes
\n[paper](https://arxiv.org/abs/2303.15269) | [code](https://github.com/aimagelab/vatr)

\n\n[57]Level-S2fM: Structure from Motion on Neural Level Set of Implicit Surfaces
\n[paper](https://arxiv.org/abs/2211.12018)

\n\n[56]FeatureBooster: Boosting Feature Descriptors with a Lightweight Neural Network
\n[paper](https://arxiv.org/abs/2211.15069)

\n\n[55]ARO-Net: Learning Implicit Fields from Anchored Radial Observations
\n[paper](https://arxiv.org/abs/2212.10275) | [code](https://github.com/yizhiwang96/aro-net)

\n\n[54]Unknown Sniffer for Object Detection: Don't Turn a Blind Eye to Unknown Objects
\n[paper](https://arxiv.org/abs/2303.13769)

\n\n[53]Robust Test-Time Adaptation in Dynamic Scenarios
\n[paper](https://arxiv.org/abs/2303.13899)

\n\n[52]LayoutFormer++: Conditional Graphic Layout Generation via Constraint Serialization and Decoding Space Restriction
\n[paper](https://arxiv.org/abs/2207.10660)

\n\n[51]Doubly Right Object Recognition: A Why Prompt for Visual Rationales
\n[paper](https://arxiv.org/abs/2212.06202)

\n\n[50]CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching
\n[paper](https://arxiv.org/abs/2303.13076)

\n\n[49]Marching-Primitives: Shape Abstraction from Signed Distance Function
\n[paper](https://arxiv.org/abs/2303.13190)

\n\n[48]Modeling Inter-Class and Intra-Class Constraints in Novel Class Discovery
\n[paper](https://arxiv.org/abs/2210.03591) | [code](https://github.com/fanzhichen/ncd-iic)

\n\n[47]ActMAD: Activation Matching to Align Distributions for Test-Time-Training
\n[paper](https://arxiv.org/abs/2211.12870) | [code](https://github.com/jmiemirza/actmad)

\n\n[46]Robust Mean Teacher for Continual and Gradual Test-Time Adaptation
\n[paper](https://arxiv.org/abs/2211.13081) | [code](https://github.com/mariodoebler/test-time-adaptation)

\n\n[45]Planning-oriented Autonomous Driving
\n[paper](https://arxiv.org/abs/2212.10156) | [code](https://github.com/opendrivelab/uniad)

\n\n[44]Explicit Visual Prompting for Low-Level Structure Segmentations
\n[paper](https://arxiv.org/abs/2303.10883) | [code](https://github.com/nifangbaage/explicit-visual-prompt)

\n\n[43]Leapfrog Diffusion Model for Stochastic Trajectory Prediction
\n[paper](https://arxiv.org/abs/2303.10895) | [code](https://github.com/mediabrain-sjtu/led)

\n\n[42]Feature Alignment and Uniformity for Test Time Adaptation
\n[paper](https://arxiv.org/abs/2303.10902)

\n\n[41]Attribute-preserving Face Dataset Anonymization via Latent Code Optimization
\n[paper](https://arxiv.org/abs/2303.11296) | [code](https://github.com/chi0tzp/falco)

\n\n[40]Fix the Noise: Disentangling Source Feature for Controllable Domain Translation
\n[paper](https://arxiv.org/abs/2303.11545) | [code](https://github.com/LeeDongYeun/FixNoise)

\n\n[39]Effective Ambiguity Attack Against Passport-based DNN Intellectual Property Protection Schemes through Fully Connected Layer Substitution
\n[paper](https://arxiv.org/abs/2303.11595)

\n\n[38]Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark
\n[paper](https://arxiv.org/abs/2303.11642) | [code](https://github.com/myniuuu/vcsd)

\n\n[37]Learning a Depth Covariance Function
\n[paper](https://arxiv.org/abs/2303.12157)

\n\n[36]VecFontSDF: Learning to Reconstruct and Synthesize High-quality Vector Fonts via Signed Distance Functions
\n[paper](https://arxiv.org/abs/2303.12675)

\n\n[35]Dense Distinct Query for End-to-End Object Detection
\n[paper](https://arxiv.org/abs/2303.12776) | [code](https://github.com/jshilong/ddq)

\n\n[34]Facial Affective Analysis based on MAE and Multi-modal Information for 5th ABAW Competition
\n[paper](https://arxiv.org/abs/2303.10849)

\n\n[33]Partial Network Cloning
\n[paper](https://arxiv.org/abs/2303.10597) | [code](https://github.com/jngwenye/pncloning)

\n\n[32]Uncertainty-Aware Optimal Transport for Semantically Coherent Out-of-Distribution Detection
\n[paper](https://arxiv.org/abs/2303.10449) | [code](https://github.com/lufan31/et-ood)

\n\n[31]Adversarial Counterfactual Visual Explanations
\n[paper](https://arxiv.org/abs/2303.09962) | [code](https://github.com/guillaumejs2403/ace)

\n\n[30]A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation
\n[paper](https://arxiv.org/abs/2303.09165) | [code](https://github.com/huitangtang/on_the_utility_of_synthetic_data)

\n\n[29]Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
\n[paper](https://arxiv.org/abs/2303.09119) | [code](https://github.com/advocate99/diffgesture)

\n\n[28]Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry
\n[paper](https://arxiv.org/abs/2303.08658) | [code](https://github.com/kebii/r2et)

\n\n[27]Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations
\n[paper](https://arxiv.org/abs/2202.04235) | [code](https://github.com/twweeb/composite-adv)

\n\n[26]Backdoor Defense via Deconfounded Representation Learning
\n[paper](https://arxiv.org/abs/2303.06818) | [code](https://github.com/zaixizhang/cbd)

\n\n[25]Label Information Bottleneck for Label Enhancement
\n[paper](https://arxiv.org/abs/2303.06836)

\n\n[24]LayoutDM: Discrete Diffusion Model for Controllable Layout Generation
\n[paper](https://arxiv.org/abs/2303.08137) | [code](https://github.com/CyberAgentAILab/layout-dm)

\n\n[23]Diversity-Aware Meta Visual Prompting
\n[paper](https://arxiv.org/abs/2303.08138) | [code](https://github.com/shikiw/dam-vp)

\n\n[22]ABAW: Valence-Arousal Estimation, Expression Recognition, Action Unit Detection & Emotional Reaction Intensity Estimation Challenges
\n[paper](https://arxiv.org/abs/2303.01498)

\n\n[21]Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
\n[paper](https://arxiv.org/abs/2303.01788)

\n\n[20]UniHCP: A Unified Model for Human-Centric Perceptions
\n[paper](https://arxiv.org/abs/2303.02936) | [code](https://github.com/opengvlab/unihcp)

\n\n[19]Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes
\n[paper](https://arxiv.org/abs/2303.04249)

\n\n[18]Revisiting Rotation Averaging: Uncertainties and Robust Losses
\n[paper](https://arxiv.org/abs/2303.05195) | [code](https://github.com/zhangganlin/globalsfmpy)

\n\n[17]3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification
\n[paper](https://arxiv.org/abs/2212.00338)

\n\n[16]Phase-Shifting Coder: Predicting Accurate Orientation in Oriented Object Detection
\n[paper](https://arxiv.org/abs/2211.06368) | [code](https://github.com/open-mmlab/mmrotate)

\n\n[15]Understanding and Improving Visual Prompting: A Label-Mapping Perspective
\n[paper](https://arxiv.org/abs/2211.11635) | [code](https://github.com/optml-group/ilm-vp)

\n\n[14]vMAP: Vectorised Object Mapping for Neural Field SLAM
\n[paper](http://arxiv.org/abs/2302.01838) | [code](https://github.com/kxhit/vMAP)

\n\n[13]EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization
\n[paper](https://arxiv.org/abs/2303.01904)

\n\n[12]Upcycling Models under Domain and Category Shift
\n[paper](https://arxiv.org/abs/2303.07110) | [code](https://github.com/ispc-lab/GLC)

\n\n[11]Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images
\n[paper](https://arxiv.org/abs/2303.06873) | [code](https://github.com/HHHedo/IBMIL)

\n\n[10]Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies
\n[paper](https://arxiv.org/abs/2303.06856)

\n\n[9]Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples
\n[paper](https://arxiv.org/abs/2301.01217) | [code](https://github.com/jiamingzhang94/Unlearnable-Clusters)

\n\n[8]Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation
\n[paper](https://arxiv.org/abs/2303.00914)

\n\n[6]Physical-World Optical Adversarial Attacks on 3D Face Recognition
\n[paper](https://arxiv.org/abs/2205.13412)

\n\n[5]Improving Cross-Modal Retrieval with Set of Diverse Embeddings
\n[paper](https://arxiv.org/abs/2211.16761)

\n\n[4]Neural Video Compression with Diverse Contexts
\n[paper](https://arxiv.org/abs/2302.14402) | [code](https://github.com/microsoft/DCVC)

\n\n[3]Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger
\n[paper](https://arxiv.org/abs/2302.14677)

\n\n[2]Single Image Backdoor Inversion via Robust Smoothed Classifiers
\n[paper](https://arxiv.org/abs/2303.00215) | [code](https://github.com/locuslab/smoothinv)

\n\n[1]Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
\n[paper](https://arxiv.org/abs/2303.00462) | [code](https://github.com/Toytiny/CMFlow)

\n\n
\n\n
\n\n \n\n\n\n# 3. CVPR2023 Paper Interpretation Roundup\n\n1.[CVPR2023｜打破对MIM（掩码图像建模）的数据缩放能力的误解！](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638753&idx=2&sn=755ecd0052718db6c02f49f50e5934ef&chksm=ec123318db65ba0edc63d9a3c479ee1763300fd612761458a11f993dcb11517671041f141b88&token=1182786253&lang=zh_CN#rd)

\n\n2.[CVPR 2023｜基于CLIP的微调新范式！训练速度和性能均创新高！](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638629&idx=2&sn=4f24bbaf13968e9d0cabf4a984bd7bf4&chksm=ec12339cdb65ba8a45bd3fe6deb6aae9b5e9b53e7ac7683f462412a114e673e420b04effb99e&token=1182786253&lang=zh_CN#rd)

\n\n3.[CVPR 2023｜浙大提出全归一化流模型PyramidFlow：高分辨率缺陷异常定位新范式](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638363&idx=1&sn=c1159fc20788f8087fc513d450d33a3f&chksm=ec1230a2db65b9b46a5b1a826d4f3d3743d1894394fe8c010c40773bd5a14ccf6abeaf80bdd3&token=1182786253&lang=zh_CN#rd)

\n\n4.[CVPR 2023｜大脑视觉信号被Stable Diffusion复现图像！“人类的谋略和谎言不存在了”](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638334&idx=1&sn=e1cfe285aa51210de985ca3c82763d50&chksm=ec1230c7db65b9d10420dc3d332b9ec46a69922b7771b29238c2953ae018d7bca301bf6a8986&token=1182786253&lang=zh_CN#rd)

\n\n5.[CVPR 2023｜港科大 DA-BEV: 3D目标检测新 SOTA，一种强大的深度信息挖掘方法](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638334&idx=2&sn=ceb12a2dcad0df502d72ab7c211e199f&chksm=ec1230c7db65b9d1e0e1fb43cf1d6c87c0adcabbac7fef1ed31277ee133db696722103709ba5&token=1182786253&lang=zh_CN#rd)

\n\n6.[CVPR 23｜表征学习超MAE，谷歌等提出MAGE：无监督图像生成超越 Latent Diffusion](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638272&idx=1&sn=454898151d1489a6b668302f8087dd20&chksm=ec1230f9db65b9ef5535a6ecd6b683768dcadca752c074c6ea2d1768106600d3c6b19d07d2f3&token=1182786253&lang=zh_CN#rd)

\n\n7.[CVPR2023｜不好意思我要加速度了！FasterNet：更高FLOPS才是更快更强的底气](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638140&idx=1&sn=ef59001e77950b1333111f6686a2bf71&chksm=ec123185db65b89334d21dbd993240bbe15d7d5f01460d57e937dfdc6e4032dadfb90a15586a&token=1182786253&lang=zh_CN#rd)

\n\n8.[CVPR 2023｜大模型流行之下，SN-Net给出一份独特的答卷](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638044&idx=1&sn=c0c8117a6ab456e7344479b2f9452f9d&chksm=ec1231e5db65b8f375fc2f21cac46f23db6431b66c42a6d405fe7a63aab1e14655c0f56f55a6&token=1182786253&lang=zh_CN#rd)

\n\n9.[CVPR 2023｜结合特征金字塔结构的自监督学习 iTPNs](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638044&idx=3&sn=9ff7c2f74531c958c100d5aa4deeb89d&chksm=ec1231e5db65b8f360c97314e545cad2e98a706a3bf78dcc99f56fc677d5ce45aa2f41805d22&token=1182786253&lang=zh_CN#rd)

\n\n10.[CVPR 2023｜SQR：对于训练DETR-family目标检测的探索和思考](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247637856&idx=1&sn=e72e7c78a5efb8be20880ae6351d8595&chksm=ec123699db65bf8f964bece49f549d6879f92c60976b4ab28ce5d4e63dc89b865fe36b8be62b&token=1182786253&lang=zh_CN#rd)

\n\n11.[CVPR 2023｜COCO新纪录65.4mAP！InternImage：注入新机制，扩展DCNv3，探索视觉大模型](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247637856&idx=2&sn=9ee6fbd7e4d7126a12692d9b0827a4f2&chksm=ec123699db65bf8f4519f6b48fc663dfafc9f6b9b42312cd0ae700db66888598524b36283b24&token=1182786253&lang=zh_CN#rd)

\n\n12.[CVPR 2023｜YOLOv7强势收录！时隔6年，YOLOv系列再登CVPR！](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247637752&idx=1&sn=44f42365e781ee941a3470ea875d0f13&chksm=ec123701db65be176c3a6d783030e2565613ef3199753001df509e035a6cd813b4a57612b9af&token=1182786253&lang=zh_CN#rd)

\n\n13.[CVPR 2023｜谷歌提出Imagic：扩散模型只用文字就能PS照片了！](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247637752&idx=3&sn=f9d479791e4b392b0790e5b5a0a5e5ef&chksm=ec123701db65be178f96580cfa74f58ff965dbcdbf274aa947c7b717c67e0b416c47322673a0&token=1182786253&lang=zh_CN#rd)

\n\n14.[CVPR 2023｜Lite DETR：计算量减少60%！高效交错多尺度编码器](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639326&idx=3&sn=dd995e387abc37ede3794b39f92cb59d&chksm=ec124ce7db65c5f132c34a3ebfd0038b1532ab0418444dc46766c82724599c53b220bb0b6428&token=693221699&lang=zh_CN#rd)

\n\n15.[CVPR 2023｜白翔团队新作：借助CLIP完成场景文字检测](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639207&idx=2&sn=90efb1cf701cf8fe09e1ba8711430a2a&chksm=ec124d5edb65c448da904439aa89c8f78d9f2acc6f03df94bdb983574dad58597dc0ca73345f&token=693221699&lang=zh_CN#rd)

\n\n16.[CVPR'23｜即插即用系列！一种轻量高效的自注意力机制助力图像恢复网络问鼎 SOTA](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639153&idx=1&sn=cab1c897df12dd95d32f9fbb041fb727&chksm=ec124d88db65c49ef92f615aa0d06d735629cb80726ed57b98da38ebcf957b2d5d0b8e315c52&token=693221699&lang=zh_CN#rd)

\n\n17.[CVPR 2023｜英伟达提出VoxFromer: 单目3D语义场景补全新SOTA](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639124&idx=2&sn=cfcffd7905063c14ccf74b1c952f8fe4&chksm=ec124daddb65c4bb5bad6bddb11b023716b5da38125af15531560ab96d0568f79a131112a95c&token=693221699&lang=zh_CN#rd)

\n\n18.[CVPR 2023｜EMA-VFI: 基于帧间注意力提取运动和外观信息的高效视频插帧](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247638921&idx=2&sn=0bb7f63c6cf4e58893e0d118fb34ece6&chksm=ec123270db65bb6645e82632228a959f67d776dd7c3ac6d7082e313b81b2588787d40625bf92&token=693221699&lang=zh_CN#rd)

\n\n19.[CVPR 2023｜Point-NN：首次实现0参数量、0训练的3D点云分析](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247641114&idx=2&sn=8ea18574b2f792514c416af5453a5374&chksm=ec1245e3db65ccf56a7a722552885e648a234162ce79ab16e9133595e949dcbe4f5743c03e84&token=1084531620&lang=zh_CN#rd)\n\n20.[CVPR 2023｜Prophet: 用小模型启发大语言模型解决外部知识图像问答](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247641011&idx=2&sn=7bb881a16453a6cac32a1bed9c0d3775&chksm=ec124a4adb65c35c0030feff129574ccbd4f39bf75080f2fdd3e2280c36b5668e987b1f3b291&token=1084531620&lang=zh_CN#rd)\n\n21.[CVPR23｜港中文和IDEA联合推出首个大规模全场景人体数据集Human-Art](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640922&idx=3&sn=dd1099fd591d6ff56d8a7d6ec66e5839&chksm=ec124aa3db65c3b5e39d6701ddbc4f798ef5fce0b1dd62e778c0808f985c8b4de1df7b75ba25&token=1084531620&lang=zh_CN#rd)\n\n22.[CVPR 2023 Highlight｜PDPP：基于扩散模型的教学视频过程规划](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640696&idx=3&sn=256219fc6b3b5845b4b092d511bb17b4&chksm=ec124b81db65c297af8b91b1a5d2e8815ae0bb40393aa9338ca90b3e8ad4c5510d4ba433089c&token=1084531620&lang=zh_CN#rd)\n\n23.[CVPR'23｜训练出首个十亿参数量视频自监督大模型！VideoMAE V2: 可扩展的视频基础模型预训练范式](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640600&idx=1&sn=fe0ef4f404a050824173983f2b8fe9a7&chksm=ec124be1db65c2f7bb03cefa7a5bd3a6530c42fabd08712c17af080222d6bf69460d20ea34a5&token=1084531620&lang=zh_CN#rd)\n\n24.[CVPR2023｜基于视觉提示器微调的多模态单目标跟踪算法](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640600&idx=2&sn=5635575dec28e369a8855e3a4cc28b67&chksm=ec124be1db65c2f7cfb17c9c498af9ae76b0399d50ba8b4e4b59b6d079e97f89fa7ccc1b7475&token=1084531620&lang=zh_CN#rd)\n\n25.[CVPR 
2023｜全新基于消费者移动设备采集的多样性RGB-D目标跟踪数据集](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640491&idx=2&sn=cd727849e8e61b69ab4369f0f16c2f6c&chksm=ec124852db65c1446fae280b41362202332440d7e7c16ea4e4dd5f8cd01f8f60dcebbf42153b&token=1084531620&lang=zh_CN#rd)\n\n26.[CVPR2023部署Trick｜解决量化误差振荡问题，让MobileNetv2的能力超过ResNet家族](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640425&idx=2&sn=2ec2cb9622858992c8cfcde3e735ce6b&chksm=ec124890db65c186cedbc4a6eb2bcc548e1c3a9718eb0ad780e628c17570d9a5768e2d8a4681&token=1084531620&lang=zh_CN#rd)\n\n27.[CVPR 2023｜IGEV-Stereo & IGEV-MVS：双目立体匹配网络新SOTA!](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640273&idx=2&sn=791533baeabc5216c15998ffd51a2c03&chksm=ec124928db65c03e2bbe32985148f48cfb8737e270f41cac0aa3b687256bf6f5ee9287e40dea&token=1084531620&lang=zh_CN#rd)\n\n28.[CVPR 2023｜UniMatch: 重新审视半监督语义分割中的强弱一致性](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640147&idx=1&sn=b20c20f1dc4db3e13184c61772916eaa&chksm=ec1249aadb65c0bcdc4829c3bba60bd0a2d5e42621478040532924f7a70e5a1b323bb2436240&token=1084531620&lang=zh_CN#rd)\n\n29.[CVPR'23｜DepGraph：任意架构的结构化剪枝，CNN、Transformer、GNN等都适用！](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640147&idx=2&sn=39cc866ba1a473294e799e85f60729ef&chksm=ec1249aadb65c0bcb1b441f15bae5ea297c5b454207ce5e944eb6a32fe4e07ce0c50a08fcca8&token=1084531620&lang=zh_CN#rd)\n\n30.[CVPR 2023｜打破CAM的局限性！ToCo：进一步激发 ViT 在弱监督语义分割的潜力](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247640028&idx=1&sn=b7222d4a4a842c00475606bbc25a59b9&chksm=ec124e25db65c733ba8ad3230b6bf0ed1d57c16528ec3ec17811c796dcdeb491170de8fdd573&token=1084531620&lang=zh_CN#rd)\n\n31.[CVPR 
2023｜一键去除视频闪烁，该研究提出了一个通用框架](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639967&idx=3&sn=a72655384a4605697665a4cc0de97dc5&chksm=ec124e66db65c770c36a441f9aa36c5779e10a257a6d49fcd0abefad15413d639c6abed46376&token=1084531620&lang=zh_CN#rd)\n\n32.[CVPR2023｜TriDet:高效时序动作检测网络，刷新三个数据集SOTA！](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639877&idx=2&sn=5a70c3460be1c17952bea674db45a935&chksm=ec124ebcdb65c7aab83727e4014751623980f19d04381916ca7c4e3901f86fdcb6a90ce8eec7&token=1084531620&lang=zh_CN#rd)\n\n33.[CVPR'23｜3D模型分割新方法！不用人工标注，只需一次训练，未标注类别也能识别](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639783&idx=1&sn=7c963096d2786a4d4e130c9039b40e34&chksm=ec124f1edb65c608e71ad8e84c10d69f7b732d434a21900c452107b15a8fa3decae69af93f07&token=1084531620&lang=zh_CN#rd)\n\n34.[CVPR 2023｜标注500类，检测7000类！清华大学等提出通用目标检测算法UniDetector](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639748&idx=2&sn=5b7f454d0dea151aef7b0b76e191970c&chksm=ec124f3ddb65c62b8456542e5b783460d8100aa33f8a6bdafe9980c2b20594cf30af2259d9cc&token=1084531620&lang=zh_CN#rd)\n\n35.[CVPR 2023｜用于半监督目标检测的知识蒸馏方法](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639720&idx=2&sn=4c17e54d8041aae9e8bb1e9159527b8f&chksm=ec124f51db65c647b2d6087cbfb8f25af585afbef1a87da877491963060f880255ab6fc95a87&token=1084531620&lang=zh_CN#rd)\n\n36.[CVPR 2023｜基于图像质量评价的半监督水下复原](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639601&idx=2&sn=a7dd06652130259824ce2ae2b4836ae7&chksm=ec124fc8db65c6deaf4fb24c1b29679780080c771ec1d29374987b13fd239762c8095dc50261&token=1084531620&lang=zh_CN#rd)\n\n37.[CVPR 2023｜基于多层多尺度重建任务的MIM改进算法](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639498&idx=2&sn=2e845d17ef70e5a963c7de7d170919e9&chksm=ec124c33db65c5257b8a942295a86ec1c944a2d0f1ff84b11aba6f970fdcfeaf918ac2f4498d&token=1084531620&lang=zh_CN#rd)\n\n38.[CVPR 
2023｜神经网络家族添新丁！小步快跑追求高速的FasterNet](https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247639403&idx=1&sn=7f90001b6eaee605ff4edc5dad0c3f9d&chksm=ec124c92db65c584bd0770bcfb5137d3f6538dc3e2bb354d702153f78374bde6b05f1c3db522&token=1084531620&lang=zh_CN#rd)\n\n\n\n\n
\n\n\n\n
\n\n \n\n# 4. CVPR2023 Paper Sharing\n\n[极市直播回放第108期丨潘梓正：模型部署新范式—可缝合神经网络（CVPR 2023）](https://www.cvmart.net/community/detail/7439)\n\n
\n\n
\n\n \n\n# 5. To do list\n\n* CVPR2023 Workshop\n" }, { "path": "CVPR2024.md", "content": "* Recommended reading:
\n * [ICCV2021/2019/2017 papers/code/interpretations/livestream collection](https://github.com/extreme-assistant/ICCV2021-Paper-Code-Interpretation)\n * [Roundup of 2020-2021 computer vision survey papers](https://github.com/extreme-assistant/survey-computer-vision)\n * [Roundup of leading computer vision teams in China and abroad](https://github.com/extreme-assistant/Awesome-CV-Team)\n * [CVPR2024 paper interpretation roundup (continuously updated)](https://www.cvmart.net/community/detail/8557)\n \n ------\n\n# CVPR2024 (PaperReading / livestream talks, etc.)\n\nOfficial site: https://cvpr.thecvf.com/Conferences/2024\n
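\n\n\n## 1.[CVPR2024 paper interpretation roundup](https://www.cvmart.net/community/detail/8557)\n\n## 2.CVPR2024 technical livestream talks\n\nTo help everyone keep up with outstanding papers and work from academia and industry, 极市 has invited more than 100 technical experts to give online livestream talks.\nAnyone interested is welcome to nominate themselves or recommend other strong speakers to present their work on 极市 and exchange ideas with the community.\n\n🌟🌟If you are interested, add the 极市 operations account on WeChat (ID: fengcall19)\n\n### 1.极市 Livestream No. 128｜CVPR2024: Applying the new vision backbone ViT-CoMer to dense prediction tasks\n* ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions
\nGitHub: https://github.com/Traffic-X/ViT-CoMer ｜ Paper: https://arxiv.org/pdf/2403.07392.pdf
\n* Speaker: Xia Chunlong (夏春龙), who holds a master's degree from Xi'an Jiaotong University, is a senior algorithm engineer at Baidu and leads traffic perception algorithms. Research interests include large models for traffic perception, 2D and 3D detection, and lightweight framework design.
\n* Replay: https://www.bilibili.com/video/BV1nm421E73X/
\n* Slides download: https://www.cvmart.net/community/detail/8582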
\n![kEQNAv](https://cdn.jsdelivr.net/gh/cynthia-yawian/images@main/uPic/kEQNAv.png)\n

\n\n\n### 2.极市 Livestream No. 129｜CVPR2024 Highlight: Can a parameter-free, training-free network also learn 3D representations?\n* No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation (CVPR2024 Highlight)
\nPaper: https://arxiv.org/abs/2404.04050 ｜ GitHub: https://github.com/yangyangyang127/Seg-NN
\n* Speaker: Zhu Xiangyang (朱向阳), a PhD student at City University of Hong Kong, working mainly on 3D vision analysis, medical image processing, and image and point cloud processing.
\n* Replay: https://www.bilibili.com/video/BV1kt421w7Ec/
\n* Slides download: https://www.cvmart.net/community/detail/14391
\n![yZ8nJi](https://cdn.jsdelivr.net/gh/cynthia-yawian/images@main/uPic/yZ8nJi.png)\n

\n\n\n" }, { "path": "README.md", "content": "\n* Recommended reading:
\n * [ICCV2021/2019/2017 papers/code/interpretations/livestream collection](https://github.com/extreme-assistant/ICCV2021-Paper-Code-Interpretation)\n * [Roundup of 2020-2021 computer vision survey papers](https://github.com/extreme-assistant/survey-computer-vision)
\n * [Roundup of leading computer vision teams in China and abroad](https://github.com/extreme-assistant/Awesome-CV-Team)\n------\n\n# cvpr2024/cvpr2023/cvpr2022/cvpr2021/cvpr2020/cvpr2019/cvpr2018/cvpr2017 (Papers/Codes/Project/Paper reading)\nPaper interpretations and technical livestreams roundup: https://bbs.cvmart.net/articles/4368
\nPapers by category: https://bbs.cvmart.net/articles/4267

\nCVPR best papers 2000~2021, with code and interpretations: http://bbs.cvmart.net/topics/665/CVPR-Best-Paper
\n
\n\n# Contents\n\n[11. CVPR2024 paper interpretation roundup](#11)
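\n[10. CVPR2023 latest news and paper downloads](#10)
\n[9. CVPR2022 latest news and paper downloads](#9)
\n[8. CVPR2021 paper downloads/code/interpretations/livestreams](#8)
\n[7. CVPR2021 papers by topic](#7)
\n[6. CVPR2020 paper downloads/code/interpretations/livestreams](#6)
\n[5. CVPR2020 papers by topic](#5)
\n[4. CVPR2019 all papers download/open-source code](#4)
\n[3. CVPR2019 papers by topic](#3)
\n[2. CVPR2019 paper livestream talks](#2)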
\n[1. CVPR2018/CVPR2017](#1)
\n\n\n\n
\n \n\n# 11. CVPR2024 paper interpretation roundup\n* [CVPR2024 paper interpretation roundup](https://www.cvmart.net/community/detail/8557)\n* [CVPR2024 livestream talks roundup](https://github.com/extreme-assistant/CVPR2024-Paper-Code-Interpretation/blob/master/CVPR2024.md)\n\n\n
\n \n\n# 10.CVPR2023 paper downloads/code/interpretations/livestreams (continuously updated)\n* [Papers/Codes/Project/PaperReading/Demos/livestream talks/paper sharing sessions, etc.](https://github.com/extreme-assistant/CVPR2023-Paper-Code-Interpretation/blob/master/CVPR2023.md)
\n* [CVPR2023 paper interpretation roundup + technical livestream roundup](https://www.cvmart.net/community/detail/7423)
\n\n\n\n
\n \n\n# 9.CVPR2022 paper downloads/code/interpretations/livestreams (continuously updated)\n* [Papers/Codes/Project/PaperReading/Demos/livestream talks/paper sharing sessions, etc.](https://github.com/extreme-assistant/CVPR2022-Paper-Code-Interpretation/blob/master/CVPR2022.md)
\n* [CVPR2022 paper interpretation roundup + technical livestream roundup](https://bbs.cvmart.net/articles/6166)
\n* [CVPR2022 Oral paper roundup/interpretations (to be updated)](https://www.cvmart.net/community/detail/6254)

\n\n
\n \n\n# 8.CVPR2021 paper downloads/code/interpretations/livestreams (continuously updated)\n* [Papers/Codes/Project/PaperReading/Demos/livestream talks/paper sharing sessions, etc.](https://github.com/extreme-assistant/CVPR2021-Paper-Code-Interpretation/blob/master/CVPR2021.md)
\n * [All CVPR2021 papers download (1,661 papers)](https://pan.baidu.com/s/1TWPkRukz9JC4Br-g_Ws5OA) Extraction code: su7e\n* [CVPR2021 paper interpretation roundup + technical livestream roundup](https://bbs.cvmart.net/articles/4368)
\n* [CVPR2021 Oral paper roundup/interpretations](https://bbs.cvmart.net/articles/4366)

\n\n
\n \n\n# 7.CVPR2021 papers by topic
\n\n* [一文看尽CVPR2021 2D 目标检测论文（27篇）](https://mp.weixin.qq.com/s/Ho7qtrpF9FhHGaamkQo6Lw)\n* [一文看尽CVPR2021 图像异常检测论文（6篇）](https://mp.weixin.qq.com/s/ysfwYQ3sVvXINPzBR91S7A)\n* [一文看尽CVPR2021 伪装目标检测+旋转目标检测论文（6篇）](https://mp.weixin.qq.com/s/w1jPD2AbxnENUBgfdFLFSg)\n* [CVPR2021 论文大盘点：全景分割论文汇总（共15篇）](https://mp.weixin.qq.com/s/_a0UmZSSxvVUFUGOnMrMhw)\n* [CVPR2021 论文大盘点：人员重识别汇总（共26篇）](https://bbs.cvmart.net/articles/5832)\n* [CVPR2021 论文大盘点：行人技术汇总（共7篇）](https://bbs.cvmart.net/articles/5831)\n* [CVPR2021 论文大盘点：医学影像汇总（共22篇）](https://bbs.cvmart.net/articles/5829)\n* [CVPR2021 论文大盘点：超分辨率汇总（共32篇）](https://bbs.cvmart.net/articles/5560)\n* [CVPR2021 论文大盘点：图像修复汇总（共20篇）](https://bbs.cvmart.net/articles/5824)\n* [CVPR2021 论文大盘点：图像去噪汇总（共14篇）](https://bbs.cvmart.net/articles/5828)\n* [CVPR2021 论文大盘点：去雾去模糊汇总（共14篇）](https://bbs.cvmart.net/articles/5827)\n* [CVPR2021 论文大盘点：图像视频去雨汇总（共10篇）](https://bbs.cvmart.net/articles/5826)\n* [CVPR2021 论文大盘点：文本图像汇总（共17篇）](https://bbs.cvmart.net/articles/5562)\n* [CVPR2021 论文大盘点：人脸识别汇总（共15篇）](https://bbs.cvmart.net/articles/5811)\n* [CVPR2021 论文大盘点：人脸造假检测汇总（共9篇）](https://bbs.cvmart.net/articles/5810)\n* [CVPR2021 论文大盘点：图像压缩汇总（共5篇）](https://bbs.cvmart.net/articles/5809)\n* [CVPR2021 论文大盘点：遥感与航拍影像汇总（共7篇）](https://bbs.cvmart.net/articles/5830)

\n\n
\n \n\n# 6.CVPR2020 paper downloads/code/interpretations/livestreams\n* [Papers/Codes/Project/PaperReading/Demos/livestream talks/paper sharing sessions, etc.](https://github.com/extreme-assistant/cvpr2020/blob/master/CVPR2020.md#cvpr2020最新信息及论文下载贴paperscodesprojectpaperreadingdemos直播分享论文分享会等)
\n* [All CVPR2020 papers download (1,467 papers)](https://pan.baidu.com/s/1UXW6iviZ_d3wpdujNgWJSQ)
\nExtraction code: pun7

\n* [CVPR2020 paper interpretation roundup + technical livestream roundup](https://bbs.cvmart.net/articles/3031)

\n\n
\n \n\n# 5.CVPR2020 papers by topic
\n* [20.CVPR 2020 论文大盘点-动作检测与动作分割（13篇）](https://bbs.cvmart.net/topics/3028)
\n* [19.CVPR 2020 论文大盘点-动作识别（21篇）](https://bbs.cvmart.net/topics/3000)
\n* [18.CVPR 2020 论文大盘点-光流（12篇）](https://bbs.cvmart.net/topics/2992)
\n* [17.CVPR 2020 论文大盘点-图像与视频检索（16篇）](https://bbs.cvmart.net/topics/2964)
\n* [16.CVPR 2020 论文大盘点-遥感与航拍影像处理识别（18篇）](https://bbs.cvmart.net/topics/2953)
\n* [15.CVPR 2020 论文大盘点-图像质量评价（7篇）](https://bbs.cvmart.net/topics/2923)
\n* [14.CVPR 2020 论文大盘点-图像修复 Inpainting （7篇）](https://bbs.cvmart.net/topics/2903)
\n* [13.CVPR 2020 论文大盘点-图像增强与图像恢复（22篇）](https://bbs.cvmart.net/topics/2902)
\n* [12.CVPR 2020 论文大盘点-去雨去雾去模糊（8篇）](https://bbs.cvmart.net/topics/2876)
\n* [11.CVPR 2020 论文大盘点-医学影像处理识别（19篇）](https://bbs.cvmart.net/topics/2855)
\n* [10.CVPR 2020 论文大盘点-抠图 Matting （3篇）](https://bbs.cvmart.net/topics/2854)
\n* [9.CVPR 2020 论文大盘点-图像分割（25篇）](https://bbs.cvmart.net/topics/2829)
\n* [8.CVPR 2020 论文大盘点-全景分割与视频目标分割（8篇）](https://bbs.cvmart.net/topics/2818)
\n* [7.CVPR 2020 论文大盘点-超分辨（21篇）](https://bbs.cvmart.net/topics/2725)
\n* [6.CVPR 2020 论文大盘点-目标检测（64篇）](https://bbs.cvmart.net/topics/2732)
\n* [5.CVPR 2020 论文大盘点-人脸技术（64篇）](https://bbs.cvmart.net/topics/2720 )
\n* [4.CVPR 2020 论文大盘点-目标跟踪（33篇）](https://bbs.cvmart.net/topics/2733 )
\n* [3.CVPR 2020 论文大盘点-文本图像（16篇）](https://bbs.cvmart.net/topics/2778 )
\n* [2.CVPR 2020 论文大盘点-行人检测与重识别（33篇）](https://bbs.cvmart.net/topics/2751)
\n* [1.CVPR 2020 论文大盘点-实例分割（18篇）](https://bbs.cvmart.net/topics/2806)

\n\n\n\n\n
\n \n\n# 4.CVPR2019 all papers download/open-source code
\n\n1,294 papers in total
\n\n* Full list: http://openaccess.thecvf.com/CVPR2019.py
\n* Download link:
\nLink: https://pan.baidu.com/s/1dhXrWFHeKeJ1kFsKBxQzVg Password: f53l\n* [Spreadsheet of open-source code for all CVPR 2019 papers](https://github.com/extreme-assistant/cvpr2019/blob/master/cvpr_2019_githublinks.csv)\n\n

\n\n
\n \n\n# 3.CVPR2019 papers by topic
\n* [CVPR 2019 论文大盘点-目标跟踪篇](http://bbs.cvmart.net/articles/523/cvpr-2019-lun-wen-da-pan-dian-mu-biao-gen-zong-pian)
\n* [CVPR 2019 论文大盘点-超分辨率篇](http://bbs.cvmart.net/topics/452/cvpr-2019-lun-wen-da-pan-dian-chao-fen-bian-lv-pian)
\n* [CVPR 2019 论文大盘点-人脸技术篇](http://bbs.cvmart.net/topics/451/cvpr-2019-lun-wen-da-pan-dian-ren-lian-ji-shu-pian)
\n* [CVPR 2019 论文大盘点—目标检测篇](https://mp.weixin.qq.com/s/l8Cfi3CIt2gqVC9i3LV6hw)
\n* [CVPR 2019 论文大盘点—文本图像篇](http://bbs.cvmart.net/topics/535/CVPR2019-Text)
\n* [CVPR2019模型剪枝论文汇总](http://bbs.cvmart.net/topics/464/cvpr-2019-gong-bu-mo-xing-jian-zhi-lun-wen-hui-zong)

\n\n
\n \n\n# 2.CVPR2019 paper livestream talks
\n* [Microsoft Research Asia CVPR2019 offline sharing session: video replay and slides download](http://bbs.cvmart.net/topics/609/CVPR-2019)\n* Replay of the point cloud segmentation talk, evening of 3/28
[王鑫龙：联合分割点云中的实例和语义（开源，列表id 27）]()
\n* Replay of the object detection talk, evening of April 18
\n[CMU诸宸辰:基于Anchor-free特征选择模块的单阶目标检测(CVPR2019，列表id 88)](https://mp.weixin.qq.com/s/CvzFG63c1bTuWFSIzNSxBA)
\n* Replay of the single-object tracking talk, evening of May 9
[张志鹏:基于siamese网络的单目标跟踪(CVPR2019 Oral，列表id 65)](https://mp.weixin.qq.com/s/3vlVXQDh6ou8Gdhg4xY2Tg)
\n* Replay of the face recognition talk, evening of May 30
[邓健康-CVPR2019:ArcFace 构建高效的人脸识别系统(CVPR2019，列表id 243)](https://mp.weixin.qq.com/s/SIHFTbDc_XjbfYfpgwNYeQ)
\n* Replay of the multi-view, multi-person 3D pose estimation talk, evening of June 13
\n[董峻廷：多视角下多人三维姿态估计 (CVPR2019，列表id 106)](https://mp.weixin.qq.com/s/Td510LMs3UWV_8d5kDgFYw)

\n\n\n
\n \n\n# 1.CVPR2018/CVPR2017
\n* CVPR 2018 all papers download (Baidu Cloud link): https://pan.baidu.com/s/1bhYzNz2TGijUdfPIdyEGtg
Password: gyk2\n* [**CVPR 2018 paper interpretation roundup**](http://bbs.cvmart.net/articles/56/cvpr-2018-lun-wen-jie-du-ji-jin-190326-geng-xin)\n* CVPR 2017 all papers download (Baidu Cloud link): https://pan.baidu.com/s/1p_If8S_AAgnTlZxfzBya2w
Password: o6tu\n* [**CVPR 2017 paper interpretation highlights**](https://zhuanlan.zhihu.com/p/27651707)

\n\n\n### Reference links
\n* https://mp.weixin.qq.com/s/YRcajgSTJq_evwtn7ZFo4A
\n* https://github.com/hoya012/CVPR-2019-Paper-Statistics
\n* https://github.com/jonahthelion/cvpr_with_code
\n* https://github.com/amusi/daily-paper-computer-vision

\n\n\n\n" }, { "path": "cvpr2018-githublinks.csv", "content": "Paper ID,Type,Title,github,Author(s),project\r\n2709,Poster,Feature Generating Networks for Zero-Shot Learning,http://datasets.d2.mpi-inf.mpg.de/xian/cvpr18xian.zip,\"Yongqin Xian, Max Planck Institute; Tobias Lorenz, Max Planck Institute for Informatics; Bernt Schiele, MPI Informatics Germany; Zeynep Akata, University of Amsterdam\",https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/zero-shot-learning/feature-generating-networks-for-zero-shot-learning/\r\n1172,Poster,Duplex Generative Adversarial Network for Unsupervised Domain Adaptation,http://vipl.ict.ac.cn/view_database.php?id=6,\"Lanqing Hu, ICT, CAS; Meina Kan, ; Shiguang Shan, Chinese Academy of Sciences; Xilin Chen,\",\r\n255,Poster,Multi-Cue Correlation Filters for Robust Visual Tracking,https://github.com/594422814/MCCT,\"Ning Wang, USTC; Wengang Zhou, USTC; Qi Tian, ; Richang Hong, ; Meng Wang, HeFei University of Technology; Houqiang Li,\",\r\n4335,Poster,Anatomical Priors in Convolutional Networks for Unsupervised Biomedical Segmentation,https://github.com/adalca/neuron,\"Adrian Dalca, ; John Guttag, ; Mert Sabuncu, Cornell\",\r\n769,Poster,Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks,https://github.com/agrimgupta92/sgan,\"Agrim Gupta, Stanford University; Justin Johnson, Stanford University; Fei-Fei Li, Stanford University; Silvio Savarese, ; Alexandre Alahi, EPFL\",\r\n779,Poster,Unsupervised Cross-dataset Person Re-identification by Transfer Learning of Spatio-temporal Patterns,https://github.com/ahangchen/Tfusion,\"Jianming Lv, South China University of Technology; Weihang Chen, South China University of Technology; Qing Li, City University of Hong Kong; Can Yang, South China University of Technology\",\r\n533,Spotlight,Recurrent Pixel Embedding for Instance Grouping,https://github.com/aimerykong/Recurrent-Pixel-Embedding-for-Instance-Grouping,\"Shu Kong, 
University of California, Irvine; Charless Fowlkes, University of California, Irvine, USA\",\r\n2089,Poster,Visual Question Answering with Memory-Augmented Networks,https://github.com/AI-metrics/AI-metrics,\"Chao Ma, ; Chunhua Shen, University of Adelaide; Anthony Dick, University of Adelaide; Qi Wu, University of Adelaide; Peng Wang, The University of Adelaide; Anton Van den Hengel, University of Adelaide; Ian Reid,\",\r\n112,Poster,Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering,https://github.com/AishwaryaAgrawal/GVQA,\"Aishwarya Agrawal, Georgia Institute of Technology; Dhruv Batra, Georgia Tech; Devi Parikh, Georgia Tech; Aniruddha Kembhavi, Allen Institute for Artificial Intelligence\",\r\n1284,Spotlight,Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints,https://github.com/alexanderrichard/action-sets,\"Alexander Richard, University of Bonn; Hilde Kuehne, University of Bonn; Juergen Gall, University of Bonn, Germany\",\r\n3247,Poster,Deformable GANs for Pose-based Human Image Generation,https://github.com/AliaksandrSiarohin/pose-gan,\"Aliaksandr Siarohin, DISI, University of Trento; Enver Sangineto, University of Trento; Stéphane Lathuilière, Inria; Nicu Sebe, University of Trento\",\r\n1328,Spotlight,Tell Me Where To Look: Guided Attention Inference Network,https://github.com/alokwhitewolf/Guided-Attention-Inference-Network,\"Kunpeng Li, Northeastern University; Ziyan Wu, Siemens Corporation; Kuan-Chuan Peng, Siemens Corporation; Jan Ernst, Siemens Corporation; Yun Fu, Northeastern University\",\r\n3606,Poster,Deep Back-Projection Networks For Super-Resolution,https://github.com/alterzero/DBPN-Pytorch,\"Muhammad Haris, Toyota Technological Institute; Greg Shakhnarovich, ; Norimichi Ukita, NAIST\",\r\n3990,Spotlight,\"Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions\",https://github.com/alvinwan/shiftresnet-cifar,\"Bichen Wu, UC Berkeley; Xiangyu Yue, UC Berkeley; Alvin Wan, 
UC Berkeley; Peter Jin, UC Berkeley; Sicheng Zhao, UC Berkeley; Noah Golmant, UC Berkeley; Amir Gholaminejad, UC Berkeley; Joseph Gonzalez, UC Berkeley; Kurt Keutzer, UC Berkeley\",\r\n2347,Spotlight,Multi-Agent Diverse Generative Adversarial Networks,https://github.com/amiryanj/socialways,\"Viveka Kulharia, University of Oxford; Arnab Ghosh, University of Oxford; Vinay P. Namboodiri, Indian Institute of Technology Kanpur; Phil Torr, Oxford; Puneet Kumar Dokania, University of Oxford\",\r\n561,Poster,CSGNet: Neural Shape Parser for Constructive Solid Geometry,https://github.com/AN313/deformable,\"Gopal Sharma, University of Massachusetts; Subhransu Maji, ; Rishabh Goyal, Indian Institute of Technology, Kanpur; Difan Liu, UMass Amherst; Evangelos Kalogerakis, UMass\",\r\n4083,Spotlight,Extreme 3D Face Reconstruction: Looking Past Occlusions,https://github.com/anhttran/extreme_3d_faces,\"Anh Tran, USC; Tal Hassner, Open Univ Israel; Iacopo Masi, USC; Gérard Medioni,\",\r\n1594,Spotlight,Sketch-a-Classifier: Sketch-based Photo Classifier Generation,https://github.com/aniket03/keras_compact_bilnear_CNN,\"Conghui Hu, Queen Mary University of London; Da Li, ; Yi-Zhe Song, ; Tao Xiang, Queen Mary University of London; Timothy Hospedales, University of Edinburgh\",\r\n3012,Poster,PIXOR: Real-time 3D Object Detection from Point Clouds,https://github.com/ankita-kalra/PIXOR,\"Bin Yang, Uber ATG, UofT; Wenjie Luo, Uber ATG, UofT; Raquel Urtasun, University of Toronto\",\r\n3890,Oral,Accurate and Diverse Sampling of Sequences based on a ``Best of Many'' Sample Objective,https://github.com/apratimbhattacharyya18/CGM_BestOfMany,\"Apratim Bhattacharyya, MPI Informatics; Mario Fritz, MPI, Saarbrucken, Germany; Bernt Schiele, MPI Informatics Germany\",\r\n1584,Spotlight,PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image,https://github.com/art-programmer/PlaneNet,\"Chen Liu, WUSTL; Jimei Yang, ; Duygu Ceylan, ; Ersin Yumer, Argo AI; Yasutaka 
Furukawa,\",\r\n985,Poster,PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning,https://github.com/arunmallya/packnet,\"Arun Mallya, UIUC; Lana Lazebnik,\",\r\n2253,Poster,“Zero-Shot” Super-Resolution using Deep Internal Learning,https://github.com/assafshocher/ZSSR,\"Assaf Shocher, Weizmann Institute of Science; Michal Irani, Weizmann Institute of Science; Nadav Cohen, Institute for Advanced Study\",\r\n3882,Poster,Analysis of Hand Segmentation in the Wild,https://github.com/aurooj/Hand-Segmentation-in-the-Wild,\"Aisha Urooj, University of Central Florida; Ali Borji, UCF\",\r\n2342,Spotlight,Multi-Content GAN for Few-Shot Font Style Transfer,https://github.com/azadis/MC-GAN,\"Samaneh Azadi, UC Berkeley; Matthew Fisher, Adobe; Vladimir Kim, Adobe Research; Zhaowen Wang, Adobe; Eli Shechtman, Adobe Research; Trevor Darrell, UC Berkeley, USA\",\r\n1240,Spotlight,Answer with Grounding Snippets: Focal Visual-Text Attention for Visual Question Answering,https://github.com/badripatro/DVQA,\"Junwei Liang, Carnegie Mellon University; Lu Jiang, ; Liangliang Cao, ; Alexander Hauptmann,\",\r\n3451,Poster,Differential Attention for Visual Question Answering,https://github.com/badripatro/DVQA,\"Badri Patro, IIT Kanpur; Vinay P. 
Namboodiri, Indian Institute of Technology Kanpur\",\r\n2640,Poster,\"ICE-BA: Incremental, Consistent and Efficient Bundle Adjustment for Visual-Inertial SLAM\",https://github.com/baidu/ICE-BA,\"Haomin Liu, Baidu; Mingyu Chen, Baidu; Guofeng Zhang, Zhejiang University; Hujun Bao, Zhejiang University; Yingze Bao, Baidu LLC\",\r\n1247,Poster,Repulsion Loss: Detecting Pedestrians in a Crowd,https://github.com/bailvwangzi/repulsion_loss_ssd,\"Xinlong Wang, Tongji University; Tete Xiao, Peking University; Yuning Jiang, Megvii inc.; Shuai Shao, Megvii; Jian Sun, ; Chunhua Shen, University of Adelaide\",\r\n1978,Oral,Synthesizing Images of Humans in Unseen Poses,https://github.com/balakg/posewarp-cvpr2018,\"Guha Balakrishnan, MIT; Adrian Dalca, ; Amy Zhao, MIT; Fredo Durand, ; John Guttag,\",\r\n3918,Poster,Visual Feature Attribution using Wasserstein GANs,https://github.com/baumgach/vagan-code,\"Christian Baumgartner, ETH Zurich; Lisa Koch, ETH Zurich; Kerem Tezcan, ETH Zurich; Jia Xi Ang, ETH Zurich; Ender Konukoglu, ETH Zurich\",\r\n4254,Poster,Eye In-Painting with Exemplar Generative Adversarial Networks,https://github.com/bdol/exemplar_gans,\"Brian Dolhansky, Facebook; Cristian Canton Ferrer, Facebook\",https://bdol.github.io/exemplar_gans/\r\n3362,Poster,Convolutional Sequence to Sequence Model for Human Dynamics,https://github.com/chaneyddtt/Convolutional-Sequence-to-Sequence-Model-for-Human-Dynamics,\"Chen Li, ; Zhen Zhang, National University of Singapore; Wee Sun Lee, ; Gim Hee Lee, National University of Singapore\",\r\n19,Poster,Frustum PointNets for 3D Object Detection from RGB-D Data,https://github.com/charlesq34/frustum-pointnets,\"Charles R. Qi, Stanford University; Wei Liu, ; Chenxia Wu, ; Hao Su, ; Leonidas J. 
Guibas,\",\r\n1589,Spotlight,Multi-Task Adversarial Network for Disentangled Feature Learning,https://github.com/charliememory/Disentangled-Person-Image-Generation,\"Yang Liu, University of Cambridge; Zhaowen Wang, Adobe; Hailin Jin, ; Ian Wassell,\",\r\n1801,Spotlight,Disentangled Person Image Generation,https://github.com/charliememory/Disentangled-Person-Image-Generation,\"Liqian Ma, KU Leuven; Qianru Sun, MPI for Informatics; Stamatios Georgoulis, KU Leuven; Mario Fritz, MPI, Saarbrucken, Germany; Bernt Schiele, MPI Informatics Germany; Luc Van Gool, KU Leuven\",\r\n1321,Oral,DoubleFusion: Real-time Capture of Human Performance with Inner Body Shape from a Single Depth Sensor,https://github.com/chenguowen/-,\"Tao Yu, Beihang University; Zerong Zheng, Tsinghua University; Kaiwen Guo, Google; Jianhui Zhao, Beihang University; Qionghai Dai, ; Hao Li, ; Gerard Pons-Moll, Max Planck for Informatics; Yebin Liu, Tsinghua University\",\r\n1329,Spotlight,Residual Dense Network for Image Super-Resolution,https://github.com/chenguowen/-,\"Yulun Zhang, Northeastern University; Yapeng Tian, University of rochester; Yu Kong, Northeastern University; Bineng Zhong, Huaqiao University; Yun Fu, Northeastern University\",\r\n380,Poster,ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing,https://github.com/chenhsuanlin/spatial-transformer-GAN,\"Chen-Hsuan Lin, CMU; Ersin Yumer, Argo AI; Oliver Wang, Adobe; Eli Shechtman, Adobe Research; Simon Lucey,\",\r\n1738,Poster,Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present,https://github.com/chenxinpeng/ARNet,\"Xinpeng Chen, Wuhan University; Lin Ma, Tencent AI Lab; Wenhao Jiang, Tencent AI Lab; Jian Yao, ; Wei Liu,\",\r\n2114,Poster,Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships,https://github.com/choasup/SIN,\"Yong Liu, ICT; Ruiping Wang, Institute of Computing Technology, Chinese Academy of Sciences; Shiguang Shan, 
Chinese Academy of Sciences; Xilin Chen,\",\r\n711,Poster,SurfConv: Bridging 3D and 2D Convolution for RGBD Images,https://github.com/chuhang/SurfConv,\"Hang Chu, University of Toronto; Wei-Chiu Ma, MIT; Kaustav Kundu, University of Toronto; Raquel Urtasun, University of Toronto; Sanja Fidler,\",\r\n369,Oral,\"Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies\",https://github.com/CMU-Perceptual-Computing-Lab/MonocularTotalCapture,\"Hanbyul Joo, CMU; Tomas Simon, Oculus Research; Yaser Sheikh,\",\r\n458,Spotlight,Iterative Visual Reasoning Beyond Convolutions,https://github.com/coderSkyChen/Iterative-Visual-Reasoning.pytorch,\"Xinlei Chen, Facebook; Li-jia Li, Google Inc; Fei-Fei Li, Google Inc.; Abhinav Gupta,\",\r\n836,Spotlight,MegDet: A Large Mini-Batch Object Detector,https://github.com/CSAILVision/semantic-segmentation-pytorch,\"Chao Peng, Megvii; Tete Xiao, Peking University; Zeming Li, Tsinghua University, Megvii; Yuning Jiang, Megvii inc.; Xiangyu Zhang, Megvii Inc; Kai Jia, Megvii; Gang Yu, Face++; Jian Sun,\",\r\n1494,Poster,Correlation Tracking via Joint Discrimination and Reliability Learning,https://github.com/cswaynecool/DRT,\"Chong Sun, Dalian University of Technology; Dong Wang, DUT; Huchuan Lu, Dalian University of Technology; Ming-Hsuan Yang, UC Merced\",\r\n1440,Poster,Learning a Single Convolutional Super-Resolution Network for Multiple Degradations,https://github.com/cszn/SRMD,\"Kai Zhang, Harbin Institute of Technology; Wangmeng Zuo, Harbin Institute of Technology; Lei Zhang, The Hong Kong Polytechnic University\",\r\n3586,Oral,Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering,https://github.com/cvlab-tohoku/Dense-CoAttention-Network,\"Nguyen Duy Kien, Tohoku University; Takayuki Okatani, Tohoku University/RIKEN AIP\",\r\n1989,Poster,Eliminating Background-bias for Robust Person 
Re-identification,https://github.com/cydonia999/Learning_to_See_in_the_Dark_PyTorch,\"Maoqing Tian, Sensetime Limited; Shuai Yi, The Chinese University of Hong Kong; Hongsheng Li, ; Shihua Li, ; Xuesen Zhang, SenseTime; Jianping Shi, SenseTime; Junjie Yan, ; Xiaogang Wang, Chinese University of Hong Kong\",\r\n1177,Poster,Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks,https://github.com/daixiangzi/Caffe-PCN,\"Shi Xuepeng, ICT; Shiguang Shan, Chinese Academy of Sciences; Meina Kan, ; Shuzhe Wu, Chinese Academy of Sciences; Xilin Chen,\",\r\n1352,Poster,IQA: Visual Question Answering in Interactive Environments,https://github.com/danielgordon10/thor-iqa-cvpr-2018,\"Daniel Gordon, University of Washington; Ali Farhadi, ; Aniruddha Kembhavi, Allen Institute for Artificial Intelligence; Dieter Fox, University of Washington; Mohammad Rastegari, AI2; Joe Redmon, University of Washington\",\r\n3491,Spotlight,Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation,https://github.com/danxuhk/StructuredAttentionDepthEstimation,\"Dan Xu, ; Wei Wang, University of Trento; Hao Tang, University of Trento; Nicu Sebe, University of Trento; Elisa Ricci, U. Perugia\",\r\n809,Spotlight,CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization,https://github.com/david-husx/crossview_localisation,\"Sixing Hu, NUS; Mengdan Feng, NUS; Rang Nguyen, National Uni. 
of Singapore; Gim Hee Lee, National University of Singapore\",\r\n1708,Poster,Learning 3D Shape Completion from Point Clouds with Weak Supervision,https://github.com/davidstutz/cvpr2018-shape-completion,\"David Stutz, MPI Saarbruecken; Andreas Geiger, MPI Tuebingen / ETH Zuerich\",\r\n2013,Poster,Weakly Supervised Action Localization by Sparse Temporal Pooling Network,https://github.com/demianzhang/weakly-action-localization,\"Phuc Nguyen, University of California, Irvine; Ting Liu, Google, Inc.; Gautam Prasad, Google, Inc.; Bohyung Han, Seoul National University\",\r\n862,Poster,Crowd Counting via Adversarial Cross-Scale Consistency Pursuit,https://github.com/diptodip/counting,\"Zan Shen, Institute of Image Communication and Network Engineering, Shanghai Jiao Tong U; Bingbing Ni, ; Yi Xu, Shanghai Jiao Tong University; Minsi Wang, Shanghai Jiao Tong University; Jianguo Hu, Minivision; Xiaokang Yang,\",\r\n1883,Spotlight,Learning deep structured active contours end-to-end,https://github.com/dmarcosg/DSAC,\"Diego Marcos, ; Devis Tuia, Wageningen University; Benjamin Kellenberger, Wageningen University and Research; Lisa Zhang, University of Toronto; Min Bai, ; Renjie Liao, ; Raquel Urtasun, University of Toronto\",\r\n4098,Spotlight,Boosting Adversarial Attacks with Momentum,https://github.com/dongyp13/Non-Targeted-Adversarial-Attacks,\"Yinpeng Dong, Tsinghua University; Fangzhou Liao, Tsinghua University; Tianyu Pang, Tsinghua University; Hang Su, Tsinghua University; Jun Zhu, Tsinghua University; Xiaolin Hu, Tsinghua; Jianguo Li, Intel Lab\",\r\n4296,Poster,clcNet: Improving the Efficiency of Convolutional Neural Network using Channel Local Convolutions,https://github.com/dqzhang17/clcnet.torch,\"Dongqing Zhang, ImaginationAI LLC\",\r\n431,Poster,Learning to Compare: Relation Network for Few-Shot Learning,https://github.com/dragen1860/LearningToCompare-Pytorch,\"Flood Sung, Independent Researcher; Yongxin Yang, Queen Mary University of London; Li Zhang, Queen Mary 
University of London; Tao Xiang, Queen Mary University of London; Phil Torr, Oxford; Timothy Hospedales, University of Edinburgh\",\r\n1645,Poster,LiDAR-Video Driving Dataset: Learning Driving Policies Effectively,https://github.com/driving-behavior/DBNet,\"Yiping Chen, Xiamen University; Jingkang Wang, Shanghai Jiao Tong University; Cewu Lu, Shanghai Jiao Tong University; Zhipeng Luo, Xiamen University; Jonathan Li, University of Waterloo; Han Xue, Shanghai Jiao Tong University; Cheng Wang, Xiamen University\",\r\n2778,Poster,Fast Spectral Ranking for Similarity Search,https://github.com/ducha-aiki/manifold-diffusion,\"Ahmet Iscen, Inria; Yannis Avrithis, Inria; Giorgos Tolias, Czech Technical University in Prague; Teddy Furon, ; Ondrej Chum, Czech Technical University in Prague\",\r\n436,Poster,Style Aggregated Network for Facial Landmark Detection,https://github.com/D-X-Y/landmark-detection,\"Xuanyi Dong, UTS; Yan Yan, UTS; Wanli Ouyang, The University of Sydney; Yi Yang,\",\r\n319,Poster,Dynamic-Structured Semantic Propagation Network,https://github.com/DylanWusee/pointconv,\"Xiaodan Liang, Carnegie Mellon University; Hongfei Zhou, ; Eric Xing, Carnegie Mellon University\",\r\n603,Poster,Who Let The Dogs Out? 
Modeling Dog Behavior From Visual Data,https://github.com/ehsanik/dogTorch,\"KIANA EHSANI, 1993; Hessam Bagherinezhad, University of Washington; Joe Redmon, University of Washington; Roozbeh Mottaghi, Allen Institute for Artificial Intelligence; Ali Farhadi,\",\r\n1561,Spotlight,SeGAN: Segmenting and Generating the Invisible,https://github.com/ehsanik/SeGAN,\"KIANA EHSANI, 1993; Roozbeh Mottaghi, Allen Institute for Artificial Intelligence; Ali Farhadi,\",\r\n1024,Poster,Boosting Self-Supervised Learning via Knowledge Transfer,https://github.com/eldar/pose-tensorflow,\"Mehdi Noroozi, University of Bern; Ananthachari Kavalkazhani Vinjimoor, UMBC; Hamed Pirsiavash, ; Paolo Favaro, Bern University, Switzerland\",\r\n2781,Poster,From source to target and back: Symmetric Bi-Directional Adaptive GAN,https://github.com/engharat/SBADAGAN,\"Paolo Russo, University of Rome La Sapienza; Fabio Carlucci, University of Rome La Sapienza; Tatiana Tommasi, Italian Institute of Tecnology; Barbara Caputo, University of Rome La Sapienza, Italy\",\r\n853,Spotlight,Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions,https://github.com/ethz-asl/hfnet,\"Torsten Sattler, ETH Zurich; Will Maddern, University of Oxford; Carl Toft, Chalmers University ; Akihiko Torii, Tokyo Institute of Technology; Lars Hammarstrand, Chalmers university of technol; Erik Stenborg, Chalmers University of Tech.; Daniel Safari, DTU; Marc Pollefeys, ETH; Josef Sivic, ; Fredrik Kahl, Chalmers; Tomas Pajdla,\",\r\n182,Poster,Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points,https://github.com/fabienbaradel/glimpse_clouds,\"Fabien Baradel, LIRIS, INSA-Lyon; Christian Wolf, INRIA, INSA-Lyon, CITI, LIRIS; Julien Mille, INSA Val de Loire; Graham Taylor, University of Guelph\",\r\n2956,Spotlight,Single View Stereo Matching,https://github.com/fabiotosi92/monoResMatch-Tensorflow,\"Yue Luo, SenseTime; Jimmy Ren, SenseTime Group Limited; Mude Lin, Sun Yat-Sen University; Jiahao 
Pang, SenseTime Group Limited; Wenxiu Sun, SenseTime Group Limited; Hongsheng Li, ; Liang Lin,\",\r\n2172,Poster,Conditional Probability Models for Deep Image Compression,https://github.com/fab-jul/imgcomp-cvpr,\"Eirikur Agustsson, ETH Zurich; Fabian Mentzer, ETHZ Zürich; Michael Tschannen, ETH Zurich; Radu Timofte, ETH Zurich; Luc Van Gool, KTH\",\r\n120,Poster,Learning by Asking Questions,https://github.com/facebook/MemNN,\"Ishan Misra, CMU; Ross Girshick, ; Rob Fergus, New York University; Martial Hebert, Carnegie Mellon University; Abhinav Gupta, ; Laurens van der Maaten, Facebook\",\r\n298,Spotlight,Audio to Body Dynamics,https://github.com/facebookresearch/Audio2BodyDynamics,\"Eli Shlizerman, Facebook; Lucio Dery, Stanford; Hayden Schoen, Facebook; Ira Kemelmacher,\",\r\n1209,Spotlight,Detecting and Recognizing Human-Object Interactions,https://github.com/facebookresearch/detectron,\"Georgia Gkioxari, Facebook; Ross Girshick, ; Kaiming He, ; Piotr Dollar, Facebook AI Research, Menlo Park, USA\",\r\n47,Poster,Learning to Segment Every Thing,https://github.com/facebookresearch/detectron,\"Ronghang Hu, UC Berkeley; Piotr Dollar, Facebook AI Research, Menlo Park, USA; Kaiming He, ; Trevor Darrell, UC Berkeley, USA; Ross Girshick,\",\r\n739,Poster,Non-local Neural Networks,https://github.com/facebookresearch/detectron,\"Xiaolong Wang, Carnegie Mellon University; Ross Girshick, ; Abhinav Gupta, ; Kaiming He,\",\r\n1565,Poster,Data Distillation: Towards Omni-Supervised Learning,https://github.com/facebookresearch/detectron,\"Ilija Radosavovic, Facebook AI Research; Piotr Dollar, Facebook AI Research, Menlo Park, USA; Ross Girshick, ; Georgia Gkioxari, Facebook; Kaiming He,\",\r\n747,Spotlight,Embodied Real-World Active Perception,https://github.com/facebookresearch/fair_self_supervision_benchmark,\"Fei Xia, Stanford University; Amir Zamir, Stanford, UC Berkeley; Zhi-Yang He, Stanford University; Alexander Sax, Stanford University; Jitendra Malik, ; Silvio 
Savarese,\",\r\n3659,Spotlight,Link and code: Fast indexing with graphs and compact regression codes,https://github.com/facebookresearch/faiss,\"Matthijs Douze, ; Herve Jegou, Facebook AI Research\",\r\n52,Poster,Embodied Question Answering,https://github.com/facebookresearch/House3D,\"Abhishek Das, Georgia Tech; Samyak Datta, Georgia Tech; Georgia Gkioxari, Facebook; Devi Parikh, Georgia Tech; Dhruv Batra, Georgia Tech; Stefan Lee, Georgia Tech\",\r\n1590,Poster,Low-shot learning with large-scale diffusion,https://github.com/facebookresearch/low-shot-with-diffusion,\"Matthijs Douze, ; Arthur Szlam, Facebook AI Research; Bharath Hariharan, Cornell University; Herve Jegou, Facebook AI Research\",\r\n3108,Spotlight,3D Semantic Segmentation with Submanifold Sparse Convolutional Networks,https://github.com/facebookresearch/SparseConvNet,\"Benjamin Graham, Facebook AI Research; Laurens van der Maaten, Facebook; Martin Engelcke, University of Oxford\",\r\n2648,Poster,A Closer Look at Spatiotemporal Convolutions for Action Recognition,https://github.com/facebookresearch/VMZ,\"Du Tran, Dartmouth College; Heng Wang, ; Lorenzo Torresani, Dartmouth College, USA; Jamie Ray, Facebook; Manohar Paluri,\",\r\n2354,Oral,Density Adaptive Point Set Registration,https://github.com/felja633/DARE,\"Felix Järemo Lawin, Linköping University; Martin Danelljan, ; Fahad Khan, Computer Vision Laboratory, Linkoping University, Sweden; Per-Erik Forssen, Linkoping University; Michael Felsberg, Linköping University\",\r\n3409,Poster,Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++,https://github.com/fidler-lab/polyrnn-pp-pytorch,\"David Acuna, University of Toronto; Huan Ling, UofT; Amlan Kar, University of Toronto; Sanja Fidler,\",\r\n3407,Poster,Robust Physical-World Attacks on Deep Learning Visual Classification,https://github.com/FloraQin0325/dvae,\"Ivan Evtimov, University of Washington; Kevin Eykholt, University of Michigan; Earlence Fernandes, University of 
Washington; Tadayoshi Kohno, University of Washington; Bo Li, UC Berkeley; Atul Prakash, University of Michigan; Amir Rahmati, University of Michigan; Chaowei Xiao, University of Michigan; Dawn Song, UC Berkeley\",\r\n471,Spotlight,Functional Map of the World,https://github.com/fmow/baseline,\"Gordon Christie, JHU/APL; Neil Fendley, JHU/APL; James Wilson, DigitalGlobe; Ryan Mukherjee, JHU/APL\",\r\n678,Poster,Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking,https://github.com/foolwood/RASNet,\"Qiang Wang, CASIA; Zhu Teng, Beijing Jiaotong University; Junliang Xing, Institute of Automation, Chinese Academy of Sciences; Jin Gao, Institute of Automation, Chinese Academy of Sciences; Weiming Hu,\",\r\n3702,Poster,Structured Uncertainty Prediction Networks,https://github.com/Garoe/tf_mvg,\"Garoe Dorta, University of Bath; Sara Vicente, Anthropics Technology Ltd; Lourdes Agapito, University College London; Neill Campbell, University of bath; Ivor Simpson, Anthropics Technology Ltd\",\r\n2636,Poster,Multimodal Visual Concept Learning with Weakly Supervised Techniques,https://github.com/gbouritsas/cvpr18_multimodal_weakly_supervised_learning,\"Giorgos Bouritsas, NTUA; Petros Koutras, NTUA; Athanasia Zlatintsi, NTUA; Petros Maragos, NTUA\",\r\n1296,Poster,Dynamic Few-Shot Visual Learning without Forgetting,https://github.com/gidariss/FewShotWithoutForgetting,\"Spyros Gidaris, Ecole des Ponts ParisTech ; Nikos Komodakis,\",\r\n3741,Spotlight,Learning Intelligent Dialogs for Bounding Box Annotation,https://github.com/google/intelligent_annotation_dialogs,\"Ksenia Konyushkova, Google; Jasper Uijlings, Google; Christoph Lampert, ; Vittorio Ferrari, google\",\r\n764,Poster,Image Generation from Scene Graphs,https://github.com/google/sg2im,\"Justin Johnson, Stanford University; Agrim Gupta, Stanford University; Fei-Fei Li, Stanford University\",\r\n1349,Spotlight,Unsupervised Training for 3D Morphable Model 
Regression,https://github.com/google/tf_mesh_renderer,\"Kyle Genova, Princeton University; Forrester Cole, Google; Aaron Maschinot, Google; Daniel Vlasic, Google; Aaron Sarna, Google; William Freeman, Google\",\r\n1926,Spotlight,Actor and Observer: Joint Modeling of First and Third-Person Videos,https://github.com/gsig/actor-observer,\"Gunnar Sigurdsson, CMU; Cordelia Schmid, INRIA Grenoble, France; Ali Farhadi, ; Abhinav Gupta, ; Karteek Alahari,\",\r\n423,Spotlight,Fast and Accurate Online Video Object Segmentation via Tracking Parts,https://github.com/guanfuchen/video_obj,\"Jingchun Cheng, Tsinghua University; Yi-Hsuan Tsai, NEC Labs America; Wei-Chih Hung, University of California, Merced; Shengjin Wang, ; Ming-Hsuan Yang, UC Merced\",\r\n633,Poster,Optimizing Video Object Detection via a Scale-Time Lattice,https://github.com/guanfuchen/video_obj,\"Kai Chen, CUHK; Jiaqi Wang, CUHK; Shuo Yang, ; Xingcheng Zhang, CUHK; Yuanjun Xiong, Amazon ; Chen-Change Loy, the Chinese University of Hong Kong; Dahua Lin, CUHK\",\r\n683,Poster,First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations,https://github.com/guiggh/hand_pose_action,\"Guillermo Garcia-Hernando, Imperial College London; Shanxin Yuan, Imperial College London; Seungryul Baek, Imperial College London; Tae-Kyun Kim, Imperial College London\",https://guiggh.github.io/publications/first-person-hands/\r\n2096,Spotlight,InLoc: Indoor Visual Localization with Dense Matching and View Synthesis,https://github.com/HajimeTaira/InLoc_demo,\"Hajime Taira, Tokyo Institute of Technology; Masatoshi Okutomi, Tokyo Institute of Technology; Torsten Sattler, ETH Zurich; Mircea Cimpoi, Czech Institute of Informatics; Marc Pollefeys, ETH; Josef Sivic, ; Tomas Pajdla, ; Akihiko Torii, Tokyo Institute of Technology\",\r\n777,Spotlight,Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference,https://github.com/hey-yahei/Quantization.MXNet,\"Benoit Jacob, Google; 
Skirmantas Kligys, Google; Bo Chen, Google; Matthew Tang, Google; Menglong Zhu, ; Andrew Howard, Google; Dmitry Kalenichenko, Google; Hartwig Adam, Google\",\r\n49,Poster,Densely Connected Pyramid Dehazing Network,https://github.com/hezhangsprinter/DCPDN,\"He Zhang, Rutgers; Vishal Patel,\",\r\n48,Poster,Density-aware Single Image De-raining using a Multi-stream Dense Network,https://github.com/hezhangsprinter/DID-MDN,\"He Zhang, Rutgers; Vishal Patel,\",\r\n1255,Spotlight,Interleaved Structured Sparse Convolutional Neural Networks,https://github.com/homles11/IGCV3,\"Guotian Xie, Sun Yat-Sen University; Ting Zhang, Microsoft Research Asia; Jianhuang Lai, Sun Yat-sen University; Jingdong Wang, Microsoft Research\",\r\n3970,Spotlight,Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework,https://github.com/hotndy/SPAC-SupplementaryMaterials,\"Jie Chen, Nanyang Technological University; Cheen-Hau Tan, ; Junhui Hou, City University of Hong Kong; Lap-Pui Chau, Nanyang Technological University; He Li,\",\r\n304,Poster,Deep Mutual Learning,https://github.com/huanghoujing/AlignedReID-Re-Production-Pytorch,\"Ying Zhang, QMUL; Tao Xiang, Queen Mary University of London; Timothy Hospedales, University of Edinburgh; Huchuan Lu, Dalian University of Technology\",\r\n1446,Poster,Learning to Parse Wireframes in Images of Man-Made Environments,https://github.com/huangkuns/wireframe,\"Kun Huang, ShanghaiTech University; Yifan Wang, ShanghaiTech University; Zihan Zhou, Penn State University; Tianjiao Ding, ; Shenghua Gao, ShanghaiTech University; Yi Ma, EECS, UC Berkeley\",\r\n3454,Poster,Deep Ordinal Regression Network for Monocular Depth Estimation,https://github.com/hufu6371/DORN,\"Huan Fu, The University of Sydney; Mingming Gong, ; Chaohui Wang, Université Paris-Est; Kayhan Batmanghelich, University of Pittsburgh; Dacheng Tao, University of Sydney\",\r\n3580,Poster,AON: Towards Arbitrarily-Oriented Text 
Recognition,https://github.com/huizhang0110/AON,\"Zhanzhan Cheng, Hikvision Research Institute; Yangliu Xu, Tongji University; Fan Bai, Fudan University; Yi Niu, Hikvision Research Institute; Shiliang Pu, ; Shuigeng Zhou, Fudan University\",\r\n1287,Oral,Squeeze-and-Excitation Networks,https://github.com/hujie-frank/SENet,\"Jie Hu, Momenta; Li Shen, University of Oxford; Gang Sun, Momenta\",\r\n789,Poster,Learning from Millions of 3D Scans for Large-scale 3D Face Recognition,https://github.com/huyhieupham/3D-Face-Recognition,\"Syed Zulqarnain Gilani, The University of Western Australia; Ajmal Mian, UWA\",\r\n1620,Poster,Multi-scale Location-aware Kernel Representation for Object Detection,https://github.com/Hwang64/MLKP,\"Hao Wang, Harbin Institute of Technology; Qilong Wang, ; Mingqi Gao, Harbin Institute of Technology; Peihua Li, ; Wangmeng Zuo, Harbin Institute of Technology\",\r\n4133,Spotlight,Decoupled Networks,https://github.com/HyeonwooNoh/caffe,\"Weiyang Liu, Georgia Tech; Zhen Liu, ; Zhiding Yu, Carnegie Mellon University; Bo Dai, ; Yisen Wang, Tsinghua University; Thomas Breuel, ; James Rehg, Georgia Institute of Technology; Jan Kautz, NVIDIA; Le Song, Georgia Institute of Technology\",\r\n4093,Spotlight,Deflecting Adversarial Attacks with Pixel Deflection,https://github.com/iamaaditya/pixel-deflection,\"Aaditya Prakash, Brandeis University; Nick Moran, Brandeis University; Solomon Garber, Brandeis University; Antonella DiLillo, Brandeis University; James Storer, Brandeis University\",\r\n1724,Oral,Convolutional Neural Networks with Alternately Updated Clique,https://github.com/iboing/CliqueNet,\"Yibo Yang, Peking Univ.; Zhisheng Zhong, ; Tiancheng Shen, ; Zhouchen Lin, Peking University, China\",\r\n1016,Poster,End-to-end weakly-supervised semantic alignment,https://github.com/ignacio-rocco/weakalign,\"Ignacio ROCCO, Inria; Relja Arandjelovic, DeepMind; Josef Sivic,\",\r\n3264,Poster,Discrete-Continuous ADMM for Transductive Inference in Higher-Order 
MRFs,https://github.com/IKMLab/argalign1,\"Emanuel Laude, TUM; Jan-Hendrik Lange, ; Jonas Schuepfer, ; Csaba Domokos, ; Laura Leal-Taixe, Technical University of Munich; Frank Schmidt, BCAI; Bjoern Andres, ; Daniel Cremers,\",\r\n3722,Poster,Generative Modeling using the Sliced Wasserstein Distance,https://github.com/ishansd/swg,\"Ishan Deshpande, UIUC; Ziyu Zhang, Snap Research; Alex Schwing,\",\r\n2916,Poster,Fully Convolutional Attention Network for Multimodal Reasoning,https://github.com/jacobgil/pytorch-grad-cam,\"Haoqi Fan, Carnegie Mellon University; Jiatong Zhou,\",\r\n3523,Spotlight,TextureGAN: Controlling Deep Image Synthesis with Texture Patches,https://github.com/janesjanes/Pytorch-TextureGAN,\"Wenqi Xian, ; Patsorn Sangkloy, Georgia Institute of Technology; Varun Agrawal, ; Amit Raj, Georgia Institute of Technolog; Jingwan Lu, Adobe Research; Chen Fang, Adobe Research; Fisher Yu, UC Berkeley; James Hays, Georgia Tech\",\r\n1315,Poster,Fast and Accurate Single Image Super-Resolution via Information Distillation Network,https://github.com/jangsoopark/IDN-TensorFlow,\"Zheng Hui, Xidian university; Xiumei Wang, Xidian university; Xinbo Gao,\",\r\n2666,Poster,DOTA: A Large-scale Dataset for Object Detection in Aerial Images,https://github.com/jessemelpolio/Faster_RCNN_for_DOTA,\"Gui-Song Xia, Wuhan University; Xiang Bai, Huazhong University of Science and Technology; Jian Ding, Wuhan University; Zhen Zhu, Huazhong University of Science and Technology; Serge Belongie, ; Jiebo Luo, University of Rochester; Mihai Datcu, ; Marcello Pelillo, University of Venice; Liangpei Zhang, Wuhan University\",\r\n456,Poster,Generative Image Inpainting with Contextual Attention,https://github.com/JiahuiYu/generative_inpainting,\"Jiahui Yu, UIUC; Zhe Lin, Adobe Systems, Inc.; Jimei Yang, ; Xiaohui Shen, Adobe Research; Xin Lu, ; Thomas Huang,\",http://jiahuiyu.com/deepfill/\r\n475,Poster,Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square 
Root Normalization,https://github.com/jiangtaoxie/fast-MPN-COV,\"Peihua Li, ; Jiangtao Xie, ; Qilong Wang, ; Zilin Gao, Dalian University of Technology\",\r\n1699,Poster,FOTS: Fast Oriented Text Spotting with a Unified Network,https://github.com/jiangxiluning/FOTS.PyTorch,\"Xuebo Liu, SenseTime Group Ltd.; Ding Liang, Sensetime; Shi Yan, SenseTime; Dagui Chen, SenseTime; Yu Qiao, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Junjie Yan,\",\r\n3318,Oral,Surface Networks,https://github.com/jiangzhongshi/SurfaceNetworks,\"Ilya Kostrikov, NYU; Joan Bruna, New York University; Daniele Panozzo, NYU; Denis Zorin, NYU\",\r\n137,Poster,Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration,https://github.com/JianqiangRen/AAMS,\"Lu Sheng, The Chinese University of HK; Jing Shao, The Sensetime Group Limited; Ziyi Lin, SenseTime Co. Ltd.; Xiaogang Wang, Chinese University of Hong Kong\",\r\n2959,Poster,Pyramid Stereo Matching Network,https://github.com/JiaRenChang/PSMNet,\"Jia-Ren Chang, National Chiao Tung University; Yong-Sheng Chen, National Chiao Tung University\",\r\n205,Spotlight,Neural Baby Talk,https://github.com/jiasenlu/NeuralBabyTalk,\"Jiasen Lu, Georgia Institute of Technology; Jianwei Yang, Georgia Tech; Dhruv Batra, Georgia Tech; Devi Parikh, Georgia Tech\",\r\n420,Poster,Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation,https://github.com/jiwoon-ahn/psa,\"Jiwoon Ahn, DGIST; Suha Kwak, POSTECH\",\r\n3005,Poster,\"OL: Orthogonal Low-rank Embedding, A Plug and Play Geometric Loss for Deep Learning\",https://github.com/jlezama/OrthogonalLowrankEmbedding,\"Jose Lezama, Universidad de la Republica, Uruguay; Qiang Qiu, ; Pablo Mus, Universidad de la Republica, Uruguay; Guillermo Sapiro, Duke\",\r\n2514,Spotlight,Deep Learning under Privileged Information Using Heteroscedastic Dropout,https://github.com/johnwlambert/dlupi-heteroscedastic-dropout,\"Ozan Sener, 
Stanford University; Silvio Savarese, ; John Lambert, Stanford University\",\r\n2672,Spotlight,Person Transfer GAN to Bridge Domain Gap for Person Re-Identification,https://github.com/JoinWei-PKU/PTGAN,\"Longhui Wei, Peking University; Shiliang Zhang, Peking University; Wen Gao, ; Qi Tian,\",\r\n892,Poster,Context-aware Deep Feature Compression for High-speed Visual Tracking,https://github.com/jongwon20000/TRACA,\"Jongwon Choi, ; Hyung Jin Chang, Imperial College London; Tobias Fischer, Imperial College London; Sangdoo Yun, Seoul National University; Jiyeoup Jeong, Seoul National University; Kyuewang Lee, Seoul National University; Yiannis Demiris, ; Jin Choi,\",\r\n740,Poster,Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs,https://github.com/JudyYe/zero-shot-gcn,\"Xiaolong Wang, Carnegie Mellon University; Yufei Ye, Carnegie Mellon University; Abhinav Gupta,\",\r\n4227,Poster,SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks,https://github.com/julianfaraone/SYQ,\"Julian Faraone, University of Sydney; Nicholas Fraser, Xilinx; Michaela Blott, Xilinx; Philip Leong,\",\r\n927,Poster,Adversarial Complementary Learning for Weakly Supervised Object Localization,https://github.com/junkwhinger/adversarial_complementary_learning,\"Xiaolin Zhang, University of Technology Sydney; Yunchao Wei, ; Jiashi Feng, ; Yi Yang, ; Thomas Huang,\",\r\n1282,Poster,RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints,https://github.com/kanezaki/rotationnet,\"Asako Kanezaki, National Institute of Advanced; Yasuyuki Matsushita, Osaka University; Yoshifumi Nishida, National Institute of Advanced Industrial Science and Technology (AIST)\",\r\n735,Poster,COCO-Stuff: Thing and Stuff Classes in Context,https://github.com/kazuto1011/deeplab-pytorch,\"Holger Caesar, University of Edinburgh; Jasper Uijlings, Google; Vitto Ferrari,\",\r\n598,Poster,CBMV: A Coalesced Bidirectional Matching Volume for 
Disparity Estimation,https://github.com/kbatsos/CBMV,\"Konstantinos Batsos, Stevens Institute of Technolog; Changjiang Cai, ; Philippos Mordohai, Stevens Institute of Technology\",\r\n482,Poster,Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?,https://github.com/kenshohara/3D-ResNets-PyTorch,\"Kensho Hara, AIST; Hirokatsu Kataoka, AIST; Yutaka Satoh, AIST\",\r\n2566,Poster,Optical Flow Guided Feature: A Motion Representation for Video Action Recognition,https://github.com/kevin-ssy/Optical-Flow-Guided-Feature,\"Shuyang Sun, The University of Sydney; Zhanghui Kuang, Sense Time; Wanli Ouyang, The University of Sydney; Lu Sheng, The Chinese University of HK; Wei Zhang,\",\r\n2503,Poster,Adversarially Learned One-Class Classifier for Novelty Detection,https://github.com/khalooei/ALOCC-CVPR2018,\"Mohammad Sabokrou, Institute for Research in Fundamental Sciences (IPM); Mohammad Khalooie, ; Mahmood Fathi, ; Ehsan Adeli, Stanford University\",\r\n2650,Oral,Wasserstein Introspective Neural Networks,https://github.com/kjunelee/WINN,\"Kwonjoon Lee, UC San Diego; Weijian Xu, UC San Diego; Fan Fan, UC San Diego; Zhuowen Tu, UCSD, USA\",\r\n2906,Poster,Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal,https://github.com/kjybinp/SCGAN,\"Jifeng Wang, NJUST; Xiang Li, NJUST; Jian Yang, Nanjing University of Science and Technology\",\r\n367,Poster,Hashing as Tie-Aware Learning to Rank,https://github.com/kunhe/TALR,\"Kun He, Boston University; Fatih Cakir, Boston University; Sarah Bargal, Boston University; Stan Sclaroff, Boston University\",\r\n1944,Poster,DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks,https://github.com/KupynOrest/DeblurGAN,\"Orest Kupyn, Ukrainian Catholic University; Volodymyr Budzan, Ukrainian Catholic University; Mykola Mykhailych, UCU; Dmytro Mishkin, Czech Technical University; Jiri Matas,\",\r\n493,Poster,Enhancing the Spatial Resolution of Stereo 
Images using a Parallax Prior,https://github.com/kyoukuntaro/StereoSR_Jeon_CVPR2018,\"Daniel S. Jeon, KAIST; Seung-Hwan Baek, KAIST; Inchang Choi, ; Min H. Kim, KAIST\",\r\n794,Poster,Improving Object Localization with Fitness NMS and Bounded IoU Loss,https://github.com/lachlants/denet,\"Lachlan Tychsen-Smith, CSIRO (Data61); Lars Petersson,\",\r\n2813,Poster,Learning Rich Features for Image Manipulation Detection,https://github.com/LarryJiang134/Image_manipulation_detection,\"Peng Zhou, University of Maryland, Colleg; Xintong Han, University of Maryland; Vlad Morariu, University of Maryland; Larry Davis, University of Maryland, USA\",\r\n967,Spotlight,SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation,https://github.com/laughtervv/SGPN,\"Weiyue Wang, USC; Ronald Yu, ; Qiangui Huang, U of Southern CA; Ulrich Neumann, USC\",\r\n2792,Spotlight,Neural 3D Mesh Renderer,https://github.com/laughtervv/tf_neural_renderer,\"Hiroharu Kato, Univ. Tokyo; Tatsuya Harada, University of Tokyo\",\r\n954,Poster,Camera Style Adaptation for Person Re-identification,https://github.com/layumi/Person_reID_baseline_pytorch,\"Zhun Zhong, Xiamen University; Liang Zheng, University of Texas at San Ant; Zhedong Zheng, UTS; Shaozi Li, ; Yi Yang, University of Technology, Sydney\",\r\n2723,Poster,Structure from Recurrent Motion: From Rigidity to Recurrency,https://github.com/leehsiu/poseLabel,\"Xiu Li, Tsinghua University; Hongdong Li, Australian National University; Hanbyul Joo, CMU; Yebin Liu, Tsinghua University; Yaser Sheikh,\",\r\n3727,Poster,CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes,https://github.com/leeyeehoo/CSRNet-pytorch,\"Yuhong Li, Beijing Univ. 
of Posts & Tels; Xiaofan Zhang, UIUC; Deming Chen, UIUC\",\r\n2790,Poster,Defense against adversarial attacks using guided denoiser,https://github.com/lfz/Guided-Denoise,\"Fangzhou Liao, Tsinghua University; Ming Liang, ; Yinpeng Dong, Tsinghua University; Tianyu Pang, Tsinghua University; Jun Zhu, Tsinghua University; Xiaolin Hu, Tsinghua University\",\r\n1353,Poster,Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking,https://github.com/lifeng9472/STRCF,\"Feng Li, Harbin Institute of Technology; Cheng Tian, Harbin Institute of Technology; Wangmeng Zuo, Harbin Institute of Technology; Lei Zhang, The Hong Kong Polytechnic University; Ming-Hsuan Yang, UC Merced\",\r\n2113,Spotlight,End-to-End Learning of Motion Representation for Video Understanding,https://github.com/LijieFan/tvnet,\"Lijie Fan, Tsinghua University; Wenbing Huang, Tencent AI Lab; Chuang Gan, Tsinghua University; Stefano Ermon, Stanford University; Junzhou Huang, UT Arlington; Boqing Gong, University of Central Florida\",\r\n2201,Poster,SO-Net: Self-Organizing Network for Point Cloud Analysis,https://github.com/lijx10/SO-Net,\"Jiaxin Li, National University of Singapore; Ben Chen, National Univ of Singapore; Gim Hee Lee, National University of Singapore\",\r\n609,Poster,Efficient Video Object Segmentation via Network Modulation,https://github.com/linjieyangsc/video_seg,\"Linjie Yang, Snap Research; Yanran Wang, Northwestern; Xuehan Xiong, Snapchat; Jianchao Yang, Snap; Aggelos Katsaggelos, Northwestern University\",\r\n110,Spotlight,Learning Intrinsic Image Decomposition from Watching the World,https://github.com/lixx2938/unsupervised-learning-intrinsic-images,\"Zhengqi Li, Cornell University; Noah Snavely, Cornell University / Google\",\r\n1915,Poster,Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs,https://github.com/loicland/superpoint_graph,\"Loic Landrieu, IGN; Martin Simonovsky, Universite Paris Est, ENPC\",\r\n2129,Poster,High-speed Tracking with 
Multi-kernel Correlation Filters,https://github.com/lukaswals/cf-trackers,\"Ming Tang, NLPR, IA, CAS; Bin Yu, NLPR, IA, CAS; Fan Zhang, BUPT; Jinqiao Wang,\",\r\nNaN,Oral,Learning Face Age Progression: A Pyramid Architecture of GANs,https://github.com/lumosity4tpj/Pytorch-Implementation-of-A-Pyramid-Architecture-of-GANs,\"Hongyu Yang, BEIHANG UNIVERSITY; Di Huang, ; Yunhong Wang, ; Anil Jain, MSU\",\r\n3392,Poster,R-FCN-3000 at 30fps: Decoupling Detection and Classification,https://github.com/MahyarNajibi/SNIPER,\"Bharat Singh, ; Hengduo Li, ; Abhishek Sharma, ; Larry Davis, University of Maryland, USA\",\r\n766,Poster,PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation,https://github.com/malavikabindhi/CS230-PointFusion,\"Danfei Xu, Stanford University; Dragomir Anguelov, Zoox Inc.; Ashesh Jain, Zoox Inc.\",\r\n3548,Spotlight,Boosting Domain Adaptation by Discovering Latent Domains,https://github.com/mancinimassimiliano/pytorch_wbn,\"Massimiliano Mancini, Sapienza University of Rome; Lorenzo Porzi, Mapillary Research; Samuel Rota Bulò, Mapillary Research; Barbara Caputo, University of Rome La Sapienza, Italy; Elisa Ricci, U. 
Perugia\",\r\n573,Poster,In-Place Activated BatchNorm for Memory-Optimized Training of DNNs,https://github.com/mapillary/inplace_abn,\"Samuel Rota Bulo', Mapillary Research; Lorenzo Porzi, Mapillary Research; Peter Kontschieder,\",\r\n2812,Oral,Discriminative Learning of Latent Features for Zero-Shot Recognition,https://github.com/MARMOTatZJU/ZSLPR-TIANCHI,\"Yan Li, CASIA; Junge Zhang, ; Jianguo Zhang, ; Kaiqi Huang,\",\r\n1830,Oral,Deep Learning of Graph Matching,https://github.com/mdeff/cnn_graph,\"Andrei Zanfir, IMAR and Lund University; Cristian Sminchisescu,\",\r\n643,Poster,Cascaded Pyramid Network for Multi-Person Pose Estimation,https://github.com/megvii-detection/MSPN,\"Yilun Chen, Beihang University; Zhicheng Wang, Megvii(Face++); Yuxiang Peng, Tsinghua University; Zhiqiang Zhang, HUST; Gang Yu, Face++; Jian Sun,\",\r\n1969,Spotlight,Learning to Extract a Video Sequence from a Single Motion-Blurred Image,https://github.com/MeiguangJin/Learning-to-Extract-a-Video-Sequence-from-a-Single-Motion-Blurred-Image,\"Meiguang Jin, University of Bern, Switzerlan; Givi Meishvili, University of Bern, Switzerland; Paolo Favaro, Bern University, Switzerland\",\r\n671,Poster,Dense 3D Regression for Hand Pose Estimation,https://github.com/melonwan/denseReg,\"Chengde Wan, ; Thomas Probst, ; Luc Van Gool, KTH; Angela Yao, University of Bonn\",\r\n3117,Poster,Real-Time Seamless Single Shot 6D Object Pose Prediction,https://github.com/Microsoft/singleshotpose,\"Bugra Tekin, ; Sudipta Sinha, Microsoft Research; Pascal Fua,\",\r\n3164,Poster,PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition,https://github.com/mikacuy/pointnetvlad.git,\"Mikaela Angelina Uy, NUS; Gim Hee Lee, National University of Singapore\",\r\n2509,Poster,Resource Aware Person Re-identification across Multiple Resolutions,https://github.com/mileyan/DARENet,\"Yan Wang, Cornell University; Lequn Wang, Cornell University; Yurong You, Shanghai Jiao Tong University; Xu Zou, Tsinghua 
University; Vincent Chen, Cornell University; Serena Li, Cornell University; Bharath Hariharan, Cornell University; Gao Huang, ; Kilian Weinberger, Cornell University\",\r\n3905,Poster,Between-class Learning for Image Classification,https://github.com/mil-tokyo/bc_learning_image,\"Yuji Tokozume, The University of Tokyo; Yoshitaka Ushiku, ; Tatsuya Harada, University of Tokyo\",\r\n1470,Oral,Maximum Classifier Discrepancy for Unsupervised Domain Adaptation,https://github.com/mil-tokyo/MCD_DA,\"Kuniaki Saito, The University of Tokyo; Kohei Watanabe, ; Yoshitaka Ushiku, ; Tatsuya Harada, University of Tokyo\",\r\n2548,Poster,V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map,https://github.com/mks0601/V2V-PoseNet_RELEASE,\"Gyeongsik Moon, Seoul National University; Ju Yong Chang, Kwangwoon University; Kyoung Mu Lee,\",\r\n155,Spotlight,3D Hand Pose Estimation: From Current Achievements to Future Goals,https://github.com/mks0601/V2V-PoseNet_RELEASE,\"Shanxin Yuan, Imperial College London; Guillermo Garcia-Hernando, Imperial College London; Bjorn Stenger, ; Tae-Kyun Kim, Imperial College London; Gyeongsik Moon, Seoul National University; Ju Yong Chang, Kwangwoon University; Kyoung Mu Lee, ; Pavlo Molchanov, NVIDIA Research; Liuhao Ge, NTU; Junsong Yuan, Nanyang Technological University; Xinghao Chen, Tsinghua University; Guijin Wang, Tsinghua University; Fan Yang, Nara Institute of Science and Technology; Kai Akiyama, Nara Institute of Science and Technology; Yang Wu, Nara Institute of Science and Technology; Qingfu Wan, Fudan University; Meysam Madadi, Autonomous University of Barcelona and Computer Vision Center, Barcelona, Spain; Sergio Escalera, University of Barcelona; Shile Li, Technical University of Munich; Dongheui Lee, Technical University of Munich; Iason Oikonomidis, FORTH; Antonis Argyros, FORTH\",\r\n303,Poster,Frame-Recurrent Video 
Super-Resolution,https://github.com/msmsajjadi/frvsr,\"Mehdi S. M. Sajjadi, Max Planck Institute for Intel; Raviteja Vemulapalli, Google; Matthew Brown,\",\r\n2496,Spotlight,Mesoscopic Facial Geometry inference Using Deep Neural Networks,https://github.com/msracver/Relation-Networks-for-Object-Detection,\"Loc Huynh, USC ICT; Weikai Chen, USC ICT; Shunsuke Saito, ; Jun Xing, ICT; Koki Nagano, Pinscreen, Inc; Andrew Jones, USC ICT; Paul Debevec, USC ICT; Hao Li,\",\r\n2498,Oral,Relation Networks for Object Detection,https://github.com/msracver/Relation-Networks-for-Object-Detection,\"Han Hu, ; Jiayuan Gu, Microsoft; Zheng Zhang, Microsoft; Jifeng Dai, Microsoft Research; Yichen Wei, Microsoft Research Asia\",\r\n2130,Spotlight,Weakly Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer,https://github.com/MVIG-SJTU/WSHP,\"Hao-Shu Fang, Shanghai Jiao Tong University; Guansong Lu, Shanghai Jiao Tong University; Xiaolin Fang, Zhejiang University; Yu-Wing Tai, Tencent YouTu; Cewu Lu, Shanghai Jiao Tong University\",\r\n473,Poster,A Two-Step Disentanglement Method,https://github.com/naamahadad/A-Two-Step-Disentanglement-Method,\"Naama Hadad, Tel Aviv University; Lior Wolf, Tel Aviv University, Israel; Moni Shahar, Tel Aviv University\",\r\n1164,Poster,Cross-Domain Weakly-Supervised Object Detection through Progressive Domain Adaptation,https://github.com/naoto0804/cross-domain-detection,\"Naoto Inoue, The University of Tokyo; Ryosuke Furuta, The University of Tokyo; Toshihiko Yamasaki, The University of Tokyo; Kiyoharu Aizawa,\",\r\n3325,Poster,Convolutional Image Captioning,https://github.com/NaskyD/convnet,\"Jyoti Aneja, UIUC; Aditya Deshpande, University of Illinois at UC; Alex Schwing,\",\r\n145,Oral,Semi-parametric Image Synthesis,https://github.com/ndrplz/semiparametric,\"Xiaojuan Qi, CUHK; Qifeng Chen, Intel Labs; Jiaya Jia, Chinese University of Hong Kong; Vladlen Koltun, Intel Labs\",\r\n1910,Poster,Neural Sign Language 
Translation,https://github.com/neccam/nslt,\"Necati Cihan Camgoz, CVSSP; Simon Hadfield, ; Richard Bowden, University of Surrey UK; Oscar Koller, ; Hermann Ney,\",\r\n3430,Spotlight,Learning Nested Structures in Deep Neural Networks,https://github.com/niceday15/nested-network-cifar100,\"Eunwoo Kim, Seoul National University; Chanho Ahn, Seoul National University; Songhwai Oh, Seoul National University\",\r\n2589,Spotlight,FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis,https://github.com/nitika-verma/FeaStNet,\"Nitika Verma, INRIA; Edmond Boyer, ; Jakob Verbeek,\",\r\n2630,Spotlight,Iterative Learning with Open-set Noisy Labels,https://github.com/NUAA-AL/ALiPy,\"Yisen Wang, Tsinghua University; Xingjun Ma, The University of Melbourne; Weiyang Liu, Georgia Tech; James Bailey, The University of Melbourne; Hongyuan Zha, Georgia Institute of Technology; Le Song, Georgia Institute of Technology; Shu-Tao Xia, Tsinghua University\",\r\n768,Oral,High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs,https://github.com/NVIDIA/pix2pixHD,\"Ting-Chun Wang, NVIDIA; Ming-Yu Liu, NVIDIA; Jun-Yan Zhu, UC Berkeley; Andrew Tao, NVIDIA; Bryan Catanzaro, NVIDIA; Jan Kautz, NVIDIA\",\r\n168,Spotlight,MapNet: Geometry-Aware Learning of Maps for Camera Localization,https://github.com/NVlabs/geomapnet,\"Samarth Brahmbhatt, Georgia Tech; Jinwei Gu, NVIDIA; Kihwan Kim, NVIDIA Research; James Hays, Georgia Tech; Jan Kautz, NVIDIA\",\r\n266,Oral,\"PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume\",https://github.com/NVlabs/PWC-Net,\"Deqing Sun, NVIDIA; Xiaodong Yang, NVIDIA; Ming-Yu Liu, NVIDIA; Jan Kautz, NVIDIA\",\r\n326,Oral,SPLATNet: Sparse Lattice Networks for Point Cloud Processing,https://github.com/NVlabs/splatnet,\"Hang Su, University of Massachusetts, Amherst; Varun Jampani, NVIDIA Research; Deqing Sun, NVIDIA; Evangelos Kalogerakis, UMass; Subhransu Maji, ; Ming-Hsuan Yang, UC Merced; Jan Kautz, 
NVIDIA\",\r\n3583,Poster,Towards dense object tracking in a 2D honeybee hive,https://github.com/oist/DenseObjectDetection,\"Katarzyna Bozek, Okinawa Institute of Science a; Laetitia Hebert, ; Alexander Mikheyev, ; Greg Stephens, OIST Graduate University and Vrije Universiteit Amsterdam\",\r\n2946,Poster,AMNet: Memorability Estimation with Attention,https://github.com/ok1zjf/amnet,\"Jiri Fajtl, Kingston University; Vasileios Argyriou, Kingston University; Dorothy Monekosso, Leeds Beckett; Paolo Remagnino, Kingston University\",\r\n435,Oral,Deep Layer Aggregation,https://github.com/osmr/imgclsmob,\"Fisher Yu, UC Berkeley; Dequan Wang, UC Berkeley; Evan Shelhamer, UC Berkeley; Trevor Darrell, UC Berkeley, USA\",\r\n2833,Poster,Ring loss: Convex Feature Normalization for Face Recognition,https://github.com/Paralysis/ringloss,\"Yutong Zheng, Carnegie Mellon University; Dipan Pal, Carnegie Mellon University; Marios Savvides,\",\r\n3761,Spotlight,Burst Denoising with Kernel Prediction Networks,https://github.com/Pavelrst/DIP_Project,\"Ben Mildenhall, UC Berkeley; Jiawen Chen, Google; Jonathan Barron, Google; Robert Carroll, Google; Dillon Sharlet, ; Ren Ng, Berkeley\",\r\n936,Poster,DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map,https://github.com/pengwangucla/DeLS-3D,\"Peng Wang, Baidu; Ruigang Yang, University of Kentucky; Binbin Cao, Baidu; Wei Xu, ; Yuanqing Lin,\",\r\n2152,Poster,Pose-Robust Face Recognition via Deep Residual Equivariant Mapping,https://github.com/penincillin/DREAM,\"Kaidi Cao, Tsinghua University; Yu Rong, CUHK; Cheng Li, SenseTime; Chen-Change Loy, the Chinese University of Hong Kong\",\r\n1163,Oral,Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering,https://github.com/peteanderson80/bottom-up-attention,\"Peter Anderson, Australian National University; Xiaodong He, ; Chris Buehler, ; Damien Teney, University of Adelaide; Mark Johnson, Macquarie University; Stephen Gould, Australian National 
University; Lei Zhang, Microsoft\",\r\n21,Poster,Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge,https://github.com/peteanderson80/bottom-up-attention,\"Damien Teney, University of Adelaide; Peter Anderson, Australian National University; Xiaodong He, ; Anton Van den Hengel, University of Adelaide\",\r\n1027,Spotlight,Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments,https://github.com/peteanderson80/Matterport3DSimulator,\"Peter Anderson, Australian National University; Qi Wu, University of Adelaide; Damien Teney, University of Adelaide; Jake Bruce, ; Mark Johnson, Macquarie University; Niko Sünderhauf, Queensland University of Technology; Ian Reid, ; Stephen Gould, Australian National University; Anton Van den Hengel, University of Adelaide\",\r\n3795,Poster,Learning Latent Super-Events to Detect Multiple Activities in Videos,https://github.com/piergiaj/super-events-cvpr18,\"AJ Piergiovanni, Indiana University; Michael Ryoo, Indiana University\",\r\n490,Spotlight,Interpretable Convolutional Neural Networks,https://github.com/PKUAI26/AT-CNN,\"Quanshi Zhang, UCLA; Yingnian Wu, ; Song-Chun Zhu,\",\r\n581,Spotlight,Deep Cross-media Knowledge Transfer,https://github.com/PKU-ICST-MIPL/DCKT_CVPR2018,\"Xin Huang, Peking University; Yuxin Peng, Peking University\",\r\n2462,Oral,Hybrid Camera Pose Estimation,https://github.com/pmiraldo/MinimalMultiPerspectivePose,\"Federico Camposeco, ETH; Andrea Cohen, ETH Zurich; Marc Pollefeys, ETH; Torsten Sattler, ETH Zurich\",\r\n1772,Spotlight,Learning Pose Specific Representations by Predicting different Views,https://github.com/poier/PreView,\"Georg Poier, Graz University of Technology; David Schinagl, ; Horst Bischof,\",https://poier.github.io/PreView/\r\n2195,Poster,Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification,https://github.com/pomonam/AttentionCluster,\"Xiang Long, Tsinghua University; Chuang 
Gan, Tsinghua University; Gerard De Melo, Rutgers University; Jiajun Wu, MIT; Xiao Liu, ; Shilei Wen, Baidu Research\",\r\n1632,Poster,Triplet-Center Loss for Multi-View 3D Object Retrieval,https://github.com/popcornell/keras-triplet-center-loss,\"Xinwei He, HUST; Yang Zhou, Huazhong University of Science and Technology; Zhichao Zhou, Huazhong University of Science and Technology; Song Bai, HUST; Xiang Bai, Huazhong University of Science and Technology\",\r\n1156,Poster,Scale-recurrent Network for Deep Image Deblurring,https://github.com/princeton-vl/DecorrelatedBN,\"Xin Tao, CUHK; Hongyun Gao, ; Yi Wang, The Chinese University of HK; Xiaoyong Shen, CUHK; Jue Wang, Megvii; Jiaya Jia, Chinese University of Hong Kong\",\r\n3217,Poster,Radially-Distorted Conjugate Translations,https://github.com/prittjam/repeats,\"James Pritts, Czech Technical University; Zuzana Kukelova, Czech Technical University in Prague; Viktor Larsson, Lund University; Ondrej Chum, Czech Technical University in Prague\",\r\n1114,Poster,A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking,https://github.com/pse-ecn/pose-sensitive-embedding,\"M. 
Saquib Sarfraz, KIT; Arne Schumann, KIT; Andreas Eberle, KIT; Rainer Stiefelhagen, Karlsruhe Institute of Technology\",\r\n1761,Poster,Fully Convolutional Adaptation Networks for Semantic Segmentation,https://github.com/pytorch/vision,\"Yiheng Zhang, University of Science and Technology of China; Zhaofan Qiu, University of Science and Technology of China; Ting Yao, Microsoft Research Asia; Dong Liu, Univ Sci Tech China; Tao Mei, Microsoft Research Asia\",\r\n2058,Spotlight,Recurrent Slice Networks for 3D Segmentation on Point Clouds,https://github.com/qianguih/RSNet,\"Qiangui Huang, U of Southern CA; Weiyue Wang, USC; Ulrich Neumann, USC\",\r\n999,Oral,Im2Flow: Motion Hallucination from Static Images for Action Recognition,https://github.com/rhgao/Im2Flow,\"Ruohan Gao, University of Texas at Austin; Bo Xiong, UT-Austin; Kristen Grauman,\",\r\n1502,Poster,Large Scale Fine-Grained Categorization and the Effectiveness of Domain-Specific Transfer Learning,https://github.com/richardaecn/cvpr18-inaturalist-transfer,\"Yin Cui, CornellTech; Yang Song, Google; Chen Sun, Google; Andrew Howard, Google; Serge Belongie,\",\r\n299,Poster,The Unreasonable Effectiveness of Deep Features as a Perceptual Metric,https://github.com/richzhang/PerceptualSimilarity,\"Richard Zhang, UC Berkeley; Phillip Isola, UC Berkeley; Alexei Efros, UC Berkeley; Eli Shechtman, Adobe Research; Oliver Wang, Adobe\",\r\n3995,Poster,Adversarial Feature Augmentation for Unsupervised Domain Adaptation,https://github.com/ricvolpi/adversarial-feature-augmentation,\"Riccardo Volpi, IIT (Italy); Pietro Morerio, Istituto Italiano di Tecnologi; Silvio Savarese, ; Vittorio Murino, Istituto Italiano di Tecnologia\",\r\n3074,Oral,Feature Space Transfer for Data Augmentation,https://github.com/rkwitt/GuidedAugmentation,\"Bo Liu, UCSD; Xudong Wang, UCSD; Mandar Dixit, UC San Diego; Roland Kwitt, ; Nuno Vasconcelos, UCSD, USA\",\r\n3535,Poster,Latent RANSAC,https://github.com/rlit/LatentRANSAC,\"Simon Korman, Weizmann 
Institute; Roee Litman, Tel-Aviv University\",\r\n325,Spotlight,Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation,https://github.com/rmalav15/Super-SloMo,\"Huaizu Jiang, UMass Amherst; Deqing Sun, NVIDIA; Varun Jampani, NVIDIA Research; Ming-Hsuan Yang, UC Merced; Erik Miller, ; Jan Kautz, NVIDIA\",\r\n2765,Oral,The Perception-Distortion Tradeoff,https://github.com/roimehrez/PIRM2018,\"Yochai Blau, Technion; Tomer Michaeli, Technion\",\r\n418,Poster,Thoracic Disease Identification and Localization with Limited Supervision,https://github.com/romanovar/evaluation_MIL,\"Zhe Li, Syracuse University; Chong Wang, Google Inc; Mei Han, Google Inc; Yuan Xue, Google; Wei Wei, Google Inc.; Li-jia Li, Google Inc; Fei-Fei Li, Google Inc.\",\r\n308,Poster,Real-world Anomaly Detection in Surveillance Videos,https://github.com/root-master/pose-anomaly-detection,\"Waqas Sultani, ; Chen Chen, University of Central Florida; Mubarak Shah, UCF\",\r\n1520,Spotlight,Attentive Generative Adversarial Network for Raindrop Removal from A Single Image,https://github.com/rui1996/DeRaindrop,\"Rui Qian, Peking University; Robby Tan, Yale-NUS College Also, Electrical and Computer Engineering, NUS; Wenhan Yang, Peking University; Jiajun Su, Peking University; Jiaying Liu, Peking University\",\r\n1946,Poster,Discriminability objective for training descriptive captions,https://github.com/ruotianluo/DiscCaptioning,\"Ruotian Luo, Toyota Technological Institute; Scott Cohen, ; Brian Price, ; Greg Shakhnarovich,\",\r\n2565,Spotlight,Deep Parametric Continuous Convolutional Neural Networks,https://github.com/rusty1s/pytorch_geometric,\"Shenlong Wang, ; Shun Da Suo, ; Wei-Chiu Ma, MIT; Raquel Urtasun, University of Toronto\",\r\n3827,Poster,SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels,https://github.com/rusty1s/pytorch_geometric,\"Matthias Fey, TU Dortmund; Jan Lenssen, TU Dortmund; Frank Weichert, TU Dortmund; Heinrich Müller, TU 
Dortmund\",\r\n3805,Poster,Two-Stream Convolutional Networks for Dynamic Texture Synthesis,https://github.com/ryersonvisionlab/two-stream-dyntex-synth,\"Matthew Tesfaldet, York University; Marcus Brubaker, York University; Konstantinos Derpanis, Ryerson University\",\r\n37,Spotlight,End-to-End Dense Video Captioning with Masked Transformer,https://github.com/salesforce/densecap,\"Luowei Zhou, University of Michigan; Yingbo Zhou, Salesforce; Jason Corso, ; Richard Socher, Meta-Mind; Caiming Xiong, Salesforce\",\r\n1893,Poster,Tags2Parts: Discovering Semantic Regions from Shape Tags,https://github.com/sanjeevmk/Tags2Parts,\"Sanjeev Muralikrishnan, IIT Bombay; Vladimir Kim, Adobe Research; Siddhartha Chaudhuri, IIT Bombay\",\r\n2711,Poster,Deep Image Prior,https://github.com/saravanabalagi/deep_image_prior,\"Dmitry Ulyanov, Skoltech; Andrea Vedaldi, U Oxford; Victor Lempitsky,\",\r\n316,Poster,Excitation Backprop for RNNs,https://github.com/sbargal/Caffe-ExcitationBP-RNNs,\"Sarah Bargal, Boston University; Andrea Zunino, Istituto Italiano di Tecnologia; Donghyun Kim, Boston University; Jianming Zhang, Adobe Research; Vittorio Murino, Istituto Italiano di Tecnologia; Stan Sclaroff, Boston University\",\r\n899,Poster,Deep Extreme Cut: From Extreme Points to Object Segmentation,https://github.com/scaelles/DEXTR-PyTorch,\"Kevis-Kokitsi Maninis, ETH Zurich; Sergi Caelles, ETH Zurich; Jordi Pont-Tuset, ETHZ; Luc Van Gool, KTH\",\r\n1029,Spotlight,Fast Video Object Segmentation by Reference-Guided Mask Propagation,https://github.com/seoungwugoh/RGMP,\"Seoung Wug Oh, Yonsei University; Joon-Young Lee, ; Kalyan Sunkavalli, Adobe Systems Inc.; Seon Joo Kim, Yonsei University\",\r\n1622,Poster,MoCoGAN: Decomposing Motion and Content for Video Generation,https://github.com/sergeytulyakov/mocogan,\"Sergey Tulyakov, ; Ming-Yu Liu, NVIDIA; Xiaodong Yang, NVIDIA; Jan Kautz, NVIDIA\",\r\n5,Poster,Single-Shot Refinement Neural Network for Object 
Detection,https://github.com/sfzhang15/RefineDet,\"Shifeng Zhang, CBSR, NLPR, CASIA; Longyin Wen, GE Global Research Center; Xiao Bian, ; Zhen Lei, Chinese Academy of Sciences ; Stan Li,\",\r\n3656,Spotlight,CondenseNet: An Efficient DenseNet using Learned Group Convolutions,https://github.com/ShichenLiu/CondenseNet,\"Gao Huang, ; Shichen Liu, Tsinghua University; Laurens van der Maaten, Facebook; Kilian Weinberger, Cornell University\",\r\n1378,Poster,Crowd Counting with Deep Negative Correlation Learning,https://github.com/shizenglin/Deep-NCL,\"Zenglin Shi, University of Bern; Le Zhang, Advanced Digital Sciences Cent; XiaoFeng Cao, university of technology sydney; Yun Liu, Nankai University; yangdong Ye, Zhengzhou University, China; Guoyan Zheng, University of Bern\",\r\n2784,Spotlight,Path Aggregation Network for Instance Segmentation,https://github.com/ShuLiu1993/PANet,\"Shu Liu, CUHK; Lu Qi, CUHK; Haifang Qin, ; Jianping Shi, SenseTime; Jiaya Jia, Chinese University of Hong Kong\",\r\n948,Poster,Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification,https://github.com/Simon4Yan/Learning-via-Translation,\"Weijian Deng, University of Chinese Academy; Liang Zheng, University of Texas at San Ant; GUOLIANG KANG, UTS; Yi Yang, ; Qixiang Ye, ; Jianbin Jiao,\",\r\n116,Poster,Human-centric Indoor Scene Synthesis Using Stochastic Grammar,https://github.com/SiyuanQi/human-centric-scene-synthesis,\"Siyuan Qi, UCLA; Yixin Zhu, UCLA; Siyuan Huang, UCLA; Chenfanfu Jiang, ; Song-Chun Zhu,\",\r\n2257,Poster,Deep Diffeomorphic Transformer Networks,https://github.com/SkafteNicki/ddtn,\"Nicki Skafte Detlefsen, DTU; Oren Freifeld, Ben-Gurion University; Soren Hauberg, Technical University of Denmark\",\r\n2312,Poster,Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition,https://github.com/songdejia/DFL-CNN,\"Yaming Wang, University of Maryland; Vlad Morariu, University of Maryland; Larry Davis, 
University of Maryland, USA\",\r\n2951,Spotlight,High Performance Visual Tracking with Siamese Region Proposal Network,https://github.com/songdejia/Siamese-RPN-pytorch,\"Bo Li, SenseTime; Wei Wu, ; Zheng Zhu, Institute of Automation, CAS; Junjie Yan,\",\r\n2541,Poster,Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing,https://github.com/speedinghzl/DSRG,\"Zilong Huang, HUST; Xinggang Wang, ; Jiasi Wang, HUST; Wenyu Liu, ; Jingdong Wang, Microsoft Research\",\r\n3284,Spotlight,Cross-modal Deep Variational Hand Pose Estimation,https://github.com/spurra/vae-hands-3d,\"Adrian Spurr, ETH Zurich; Jie Song, ETHZ; Seonwook Park, ETH Zurich; Otmar Hilliges, ETH Zurich\",https://ait.ethz.ch/projects/2018/vae_hands/\r\n1598,Poster,Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks,https://github.com/srama2512/sidekicks,\"Dinesh Jayaraman, UT Austin; Kristen Grauman,\",\r\n3009,Poster,Efficient parametrization of multi-domain deep neural networks,https://github.com/srebuffi/residual_adapters,\"Sylvestre-Alvise Rebuffi, University of Oxford; Hakan Bilen, University of Oxford; Andrea Vedaldi, U Oxford\",\r\n3718,Oral,Ordinal Depth Supervision for 3D Human Pose Estimation,https://github.com/ssfootball04/generative_pose,\"Georgios Pavlakos, ; Xiaowei Zhou, Zhejiang University; Kostas Daniilidis, University of Pennsylvania\",\r\n744,Oral,Taskonomy: Disentangling Task Transfer Learning,https://github.com/StanfordVL/taskonomy,\"Alexander Sax, Stanford University; William Shen, ; Amir Zamir, Stanford, UC Berkeley; Jitendra Malik, ; Silvio Savarese, ; Leonidas J. 
Guibas,\",\r\n2494,Spotlight,Towards High Performance Video Object Detection,https://github.com/stanlee321/LightFlow-TensorFlow,\"Xizhou Zhu, ; Jifeng Dai, Microsoft Research; Lu Yuan, Microsoft Research Asia; Yichen Wei, Microsoft Research Asia\",\r\n429,Poster,Future Frame Prediction for Anomaly Detection - A New Baseline,https://github.com/StevenLiuWen/ano_pred_cvpr2018,\"Wen Liu, ShanghaiTech University; Weixin Luo, Shanghaitech University; Dongze Lian, ShanghaiTech University; Shenghua Gao, ShanghaiTech University\",\r\n909,Spotlight,Unsupervised Deep Generative Adversarial Hashing Network,https://github.com/sudiptodip15/ClusterGAN,\"Kamran Ghasedi Dizaji, University of Pittsburgh; Feng Zheng, University of Pittsburgh; Najmeh Sadoughi, University of Texas at Dallas; Heng Huang, University of Pittsburgh\",\r\n2136,Poster,Encoding Crowd Interaction with Deep Neural Network for Pedestrian Trajectory Prediction,https://github.com/svip-lab/CIDNN,\"Yanyu Xu, Shanghaitech University; Zhixin Piao, ; Shenghua Gao, ShanghaiTech University\",\r\n2138,Poster,Spline Error Weighting for Robust Visual-Inertial Fusion,https://github.com/svip-lab/CIDNN,\"Hannes Ovrén, Linköping University; Per-Erik Forssen, Linkoping University\",\r\n2895,Spotlight,Future Person Localization in First-Person Videos,https://github.com/takumayagi/fpl,\"Takuma Yagi, The University of Tokyo; Karttikeya Mangalam, IIT Kanpur; Ryo Yonetani, The University of Tokyo; Yoichi Sato, Univ of Tokyo\",\r\n842,Poster,AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks,https://github.com/taoxugit/AttnGAN,\"Tao Xu, Lehigh University; Pengchuan Zhang, ; Qiuyuan Huang, ; Han Zhang, Rutgers; Zhe Gan, ; Xiaolei Huang, Lehigh; Xiaodong He,\",\r\n144,Spotlight,Tangent Convolutions for Dense Prediction in 3D,https://github.com/tatarchm/tangent_conv,\"Maxim Tatarchenko, Freiburg; Jaesik Park, Intel Labs; Qian-Yi Zhou, ABQ Technologies; Vladlen Koltun, Intel 
Labs\",\r\n2918,Spotlight,A PID Controller Approach for Stochastic Optimization of Deep Networks,https://github.com/tensorboy/PIDOptimizer,\"An Wangpeng, Tsinghua University; Haoqian Wang, Tsinghua University, Shenzhen Graduate School; Qingyun Sun, Stanford University; Jun Xu, Hong Kong Polytechnic U; Qionghai Dai, Tsinghua University; Lei Zhang, The Hong Kong Polytechnic University\",\r\n2730,Poster,Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking,https://github.com/tensorflow/models/tree/master/research/delf,\"Filip Radenovic, CTU Prague; Ahmet Iscen, Inria; Giorgos Tolias, Czech Technical University in Prague; Yannis Avrithis, Inria; Ondrej Chum, Czech Technical University in Prague\",\r\n1063,Spotlight,AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions,https://github.com/tensorflow/models/tree/master/research/object_detection,\"Chunhui Gu, Google; Chen Sun, Google; David Ross, Google Research; Carl Vondrick, Google; Caroline Pantofaru, Google; Yeqing Li, Google Inc.; Sudheendra Vijayanarasimhan, Google Research; George Toderici, Google; Susanna Ricco, Google; Rahul Sukthankar, Google Research; Cordelia Schmid, INRIA Grenoble, France; Jitendra Malik,\",\r\n1916,Spotlight,The iNaturalist Species Classification and Detection Dataset,https://github.com/tensorflow/models/tree/master/research/object_detection,\"Grant van Horn, California Institute of Technology; Oisin Mac Aodha, Caltech; Yang Song, Google; Yin Cui, CornellTech; Chen Sun, Google; Alex Shepard, iNaturalist; Hartwig Adam, Google; Pietro Perona, California Institute of Technology, USA; Serge Belongie,\",\r\n3298,Spotlight,Learning Transferable Architectures for Scalable Image Recognition,https://github.com/tensorflow/models/tree/master/research/slim,\"Barret Zoph, Google; Vijay Vasudevan, Google; Jonathon Shlens, Google; Quoc Le, Google\",\r\n2022,Spotlight,DenseASPP: Densely Connected Networks for Semantic 
Segmentation,https://github.com/tensorflow/tensorflow,\"Maoke Yang, DeepMotion; Kun Yu, DeepMotion; Kuiyuan Yang, DeepMotion\",\r\n642,Poster,ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices,https://github.com/tensorpack/tensorpack/tree/master/examples/ImageNetModels,\"Xiangyu Zhang, Megvii Inc; Xinyu Zhou, Megvii Technology Inc.; Mengxiao Lin, Megvii Technology Ltd.(Face++); Jian Sun,\",\r\n2769,Spotlight,Video Based Reconstruction of 3D People Models,https://github.com/thmoa/videoavatars,\"Thiemo Alldieck, TU Braunschweig; Marcus Magnor, TU Braunschweig; Weipeng Xu, MPI Informatics; Christian Theobalt, MPI Informatics; Gerard Pons-Moll, Max Planck for Informatics\",\r\n1580,Spotlight,Context Embedding Networks,https://github.com/thunlp/CANE,\"Kun ho Kim, Caltech; Oisin Mac Aodha, Caltech; Pietro Perona, California Institute of Technology, USA\",\r\n3436,Poster,Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN,https://github.com/TobiasLee/Text-Classification,\"Shuai Li, University of Wollongong; Wanqing Li, ; Chris Cook, University of Wollongong; Ce Zhu, University of Electronic Science and Technology of China; Yanbo Gao, University of Electronic Science and Technology of China\",\r\n3809,Poster,Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks,https://github.com/TomVeniat/bsn,\"Tom Veniat, Lip6 - MLIA; Ludovic Denoyer, UPMC\",\r\n1390,Poster,An End-to-End TextSpotter with Explicit Alignment and Attention,https://github.com/tonghe90/textspotter,\"Tong He, The University of Adelaide; Zhi Tian, SIAT, CAS; Weilin Huang, The University of Oxford; Chunhua Shen, University of Adelaide; Yu Qiao, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Changming Sun, CSIRO Data61\",\r\n2763,Poster,SketchMate: Deep Hashing for Million-Scale Human Sketch Retrieval,https://github.com/tosmaster/imagevision,\"Peng Xu, Beijing University of Posts an; Yongye Huang, 
Beijing University of Posts and Telecommunications; Tongtong Yuan, Beijing University of Posts and Telecommunications; Kaiyue Pang, QMUL; Yi-Zhe Song, ; Tao Xiang, Queen Mary University of London; Timothy Hospedales, University of Edinburgh; Zhanyu Ma, Beijing University of Posts and Telecommunications ; Jun Guo, Beijing University of Posts and Telecommunications\",\r\n203,Spotlight,Nonlinear 3D Face Morphable Model,https://github.com/tranluan/Nonlinear_Face_3DMM,\"LUAN TRAN, Michigan State University; Xiaoming Liu, Michigan State University\",\r\n1018,Spotlight,PointGrid: A Deep Network for 3D Shape Understanding,https://github.com/trucleduc/PointGrid,\"Truc Le, University of Missouri - Columbia; Ye Duan, University of Missouri - Columbia\",\r\n3333,Poster,VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection,https://github.com/tsinghua-rll/VoxelNet-tensorflow,\"Yin Zhou, Lawrence Berkeley National Lab; Oncel Tuzel,\",\r\n136,Spotlight,LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation,https://github.com/twhui/LiteFlowNet,\"Tak-Wai Hui, The Chinese University of Hong Kong; Chen-Change Loy, the Chinese University of Hong Kong; Xiaoou Tang, Chinese University of Hong Kong\",\r\n106,Poster,Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation,https://github.com/twni2016/OrganSegRSTN_PyTorch,\"Qihang Yu, Peking University; Lingxi Xie, UCLA; Yan Wang, JHU; Yuyin Zhou, JHU; Elliot Fishman, ; Alan Yuille, JHU\",\r\n1453,Oral,Learning to Find Good Correspondences,https://github.com/ucbdrive/hd3,\"Kwang Moo Yi, EPFL; Eduard Trulls, ; Yuki Ono, Sony; Vincent Lepetit, TU Graz; Mathieu Salzmann, EPFL; Pascal Fua,\",\r\n1251,Poster,PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection,https://github.com/Ugness/PiCANet-Implementation,\"Nian Liu, Northwestern Polytechnical University; Junwei Han, Northwestern Polytechnical U.; Ming-Hsuan Yang, UC 
Merced\",\r\n618,Poster,Feedback-prop: Convolutional Neural Network Inference under Partial Evidence,https://github.com/uvavision/feedbackprop,\"Tianlu Wang, 1994; Kota Yamaguchi, CyberAgent, Inc.; Vicente Ordonez, University of Virginia\",\r\n1532,Oral,Detail-Preserving Pooling in Deep Networks,https://github.com/visinf/dpp,\"Faraz Saeedan, TU Darmstadt; Nicolas Weber, ; Michael Goesele, TU Darmstadt; Stefan Roth,\",\r\n1887,Poster,Unsupervised Domain Adaptation with Similarity-Based Classifier,https://github.com/VisionLearningGroup/DA_Detection,\"Pedro Pinheiro, EPFL\",\r\n3599,Poster,Learning Less is More - 6D Camera Localization via 3D Surface Regression,https://github.com/vislearn/LessMore,\"Eric Brachmann, TU Dresden; Carsten Rother, University of Heidelberg\",\r\n3262,Poster,An Unsupervised Learning Model for Deformable Medical Image Registration,https://github.com/voxelmorph/voxelmorph,\"Guha Balakrishnan, MIT; Adrian Dalca, ; Amy Zhao, MIT; Mert Sabuncu, Cornell; John Guttag,\",\r\n2777,Poster,Appearance-and-Relation Networks for Video Classification,https://github.com/wanglimin/ARTNet,\"Limin Wang, ETH Zurich; Wei Li, Google; Wen Li, ETH; Luc Van Gool, KTH\",\r\n1435,Spotlight,Learning to Adapt Structured Output Space for Semantic Segmentation,https://github.com/wasidennis/AdaptSegNet,\"Yi-Hsuan Tsai, NEC Labs America; Wei-Chih Hung, University of California, Merced; Samuel Schulter, NEC Labs; Kihyuk Sohn, NEC Laboratories America; Ming-Hsuan Yang, UC Merced; Manmohan Chandraker, NEC Labs America\",\r\n2335,Poster,SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis,https://github.com/wchen342/SketchyGAN,\"Wengling Chen, Georgia Institute of Technolog; James Hays, Georgia Tech\",\r\n3225,Poster,Image Super-resolution via Dual-state Recurrent Neural Networks,https://github.com/WeiHan3/dsrn,\"Wei Han, UIUC; Shiyu Chang, ; Ding Liu, UIUC; Michael Witbrock, ; Thomas Huang,\",\r\n3524,Spotlight,Rethinking Feature Distribution for Loss Functions 
in Image Classification,https://github.com/WeitaoVan/L-GM-loss,\"Weitao Wan, Tsinghua University; Yuanyi Zhong, UIUC; Tianpeng Li, Tsinghua University; Jiansheng Chen, Tsinghua University\",\r\n178,Poster,Salient Object Detection Driven by Fixation Prediction,https://github.com/wenguanwang/ASNet,\"Wenguan Wang, Beijing Institute of Technology; Jianbing Shen, Beijing Institute of Technolog; Xingping Dong, Beijing Institute of Technology; Ali Borji, UCF\",\r\n353,Poster,Revisiting Video Saliency: A Large-scale Benchmark and a New Model,https://github.com/wenguanwang/DHF1K,\"Wenguan Wang, Beijing Institute of Technology; Jianbing Shen, Beijing Institute of Technolog; Fang Guo, Beijing Institute of Technology; Ming-Ming Cheng, Nankai University; Ali Borji, UCF\",\r\n425,Poster,Min-Entropy Latent Model for Weakly Supervised Object Detection,https://github.com/WinFrand/MELM,\"Fang Wan, UCAS; Pengxu Wei, ; Jianbin Jiao, ; Zhenjun Han, ; Qixiang Ye,\",\r\n542,Poster,Fast End-to-End Trainable Guided Filter,https://github.com/wuhuikai/DeepGuidedFilter,\"Huikai Wu, CASIA; Shuai Zheng, EBay; Junge Zhang, ; Kaiqi Huang, National Laboratory of Pattern Recognition\",\r\n1330,Poster,Look at Boundary: A Boundary-Aware Face Alignment Algorithm,https://github.com/wywu/LAB,\"Wayne Wu, SenseTime; Chen Qian, SenseTime; Shuo Yang, ; Quan Wang, SenseTime\",\r\n3180,Poster,Leveraging Unlabeled Data for Crowd Counting by Learning to Rank,https://github.com/xialeiliu/CrowdCountingCVPR18,\"Xialei Liu, Computer Vision Center of UAB; Joost van de Weijer, Computer Vision Center Barcelona; Andrew Bagdanov, Computer Vision Center, Barcelona\",\r\n574,Poster,Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks,https://github.com/xialuxi/arcface-caffe,\"Zhenhua Feng, University of Surrey; Muhammad Awais, university of surrey; Josef Kittler, ; Patrik Huber, University of Surrey; Xiaojun Wu, Jiangnan University\",\r\n2905,Spotlight,Smooth Neighbors on Teacher Graphs for 
Semi-supervised Learning,https://github.com/xinmei9322/SNTG,\"Yucen Luo, Tsinghua University; Jun Zhu, Tsinghua University; Mengxi Li, Tsinghua University; Yong Ren, Tsinghua University; Bo Zhang,\",\r\n695,Poster,Recovering Realistic Texture in Image Super-resolution by Spatial Feature Modulation,https://github.com/xinntao/BasicSR,\"Xintao Wang, CUHK University; Ke Yu, CUHK; Chao Dong, Sensetime Co. Ltd ; Chen-Change Loy, the Chinese University of Hong Kong\",\r\n2837,Poster,GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation,https://github.com/xjqi/GeoNet,\"Xiaojuan Qi, CUHK; Renjie Liao, ; Zhengzhe Liu, CUHK; Raquel Urtasun, University of Toronto; Jiaya Jia, Chinese University of Hong Kong\",\r\n406,Spotlight,VITON: An Image-based Virtual Try-on Network,https://github.com/xthan/VITON,\"Xintong Han, University of Maryland; Zuxuan Wu, University of Maryland; Zhe Wu, University of Maryland; Ruichi Yu, ; Larry Davis, University of Maryland, USA\",\r\n3602,Spotlight,xUnit: Learning a Spatial Activation Function for Efficient Image Restoration,https://github.com/yang-song/APDLM,\"Idan Kligvasser, Technion; Tamar Rott Shaham, Technion; Tomer Michaeli, Technion\",\r\n847,Poster,End-to-End Deep Kronecker-Product Matching for Person Re-identification,https://github.com/YantaoShen/kpm_rw_person_reid,\"Yantao Shen, CUHK; Tong Xiao, The Chinese University of HK; Hongsheng Li, ; Shuai Yi, The Chinese University of Hong Kong; Xiaogang Wang, Chinese University of Hong Kong\",\r\n874,Poster,Deep Group-shuffling Random Walk for Person Re-identification,https://github.com/YantaoShen/kpm_rw_person_reid,\"Yantao Shen, CUHK; Hongsheng Li, ; Tong Xiao, The Chinese University of HK; Shuai Yi, The Chinese University of Hong Kong; Dapeng Chen, CUHK; Xiaogang Wang, Chinese University of Hong Kong\",\r\n3022,Spotlight,Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and 
Semantics,https://github.com/yaringal/multi-task-learning-example,\"Alex Kendall, ; Yarin Gal, University of Cambridge; Roberto Cipolla,\",\r\n1644,Spotlight,Separating Style and Content for Generalized Style Transfer,https://github.com/ycjing/Character-Stylization,\"Yexun Zhang, Shanghai Jiao Tong University; Ya Zhang, ; Wenbin Cai,\",\r\n1535,Spotlight,Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation,https://github.com/ycliu93/CDRD,\"Yen-Cheng Liu, National Taiwan University; Yu-Ying Yeh, National Taiwan University; Tzu-Chien Fu, Northwestern University; Wei-Chen Chiu, National Chiao Tung University; Sheng-De Wang, National Taiwan University; Yu-Chiang Frank Wang, Academia Sinica\",\r\n632,Poster,Learning a Discriminative Feature Network for Semantic Segmentation,https://github.com/ycszen/TorchSeg,\"Changqian Yu, HUST; Jingbo Wang, Peking University; Chao Peng, Megvii; Changxin Gao, HUST; Gang Yu, Face++; Nong Sang,\",\r\n823,Poster,Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation,https://github.com/yhjo09/VSR-DUF,\"Younghyun Jo, Yonsei University; Seoung Wug Oh, Yonsei University; JaeYeon Kang, Yonsei Univ.; Seon Joo Kim, Yonsei University\",\r\n2205,Poster,CartoonGAN: Generative Adversarial Networks for Photo Cartoonization,https://github.com/Yijunmaverick/CartoonGAN-Test-Pytorch-Torch,\"Yang Chen, Tsinghua University; Yu-Kun Lai, Cardiff University; Yong-Jin Liu,\",\r\n3324,Spotlight,Deep Depth Completion of a Single RGB-D Image,https://github.com/yindaz/DeepCompletionRelease,\"Yinda Zhang, Princeton; Thomas Funkhouser, Princeton\",\r\n3006,Spotlight,Compressed Video Action Recognition,https://github.com/yjxiong/caffe,\"Chao-Yuan Wu, UT Austin; Manzil Zaheer, Carnegie Mellon University; Hexiang Hu, ; R. 
Manmatha, A9; Alexander Smola, ; Philipp Krahenbuhl,\",\r\n488,Spotlight,Zero-Shot Sketch-Image Hashing,https://github.com/ymcidence/Zero-Shot-Sketch-Image-Hashing,\"Yuming Shen, University of East Anglia; Li Liu, University of East Anglia; Fumin Shen, ; Ling Shao, University of East Anglia\",\r\n2823,Spotlight,Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network,https://github.com/ypxie/HDGan,\"Zizhao Zhang, University of Florida; Yuanpu Xie, University of Florida; Lin Yang,\",\r\n1320,Poster,Domain Adaptive Faster R-CNN for Object Detection in the Wild,https://github.com/yuhuayc/da-faster-rcnn,\"Yuhua Chen, CVL@ETHZ; Wen Li, ETH; Luc Van Gool, KTH\",\r\n630,Spotlight,Learning a Toolchain for Image Restoration,https://github.com/yuke93/RL-Restore,\"Ke Yu, CUHK; Chao Dong, Sensetime Co. Ltd ; Chen-Change Loy, the Chinese University of Hong Kong\",\r\n2337,Poster,Grounding Referring Expressions in Images by Variational Context,https://github.com/yuleiniu/vc,\"Hanwang Zhang, Columbia University; Yulei Niu, Renmin University of China; Shih-Fu Chang,\",\r\n1248,Poster,PU-Net: Point Cloud Upsampling Network,https://github.com/yulequan/PU-Net,\"Lequan Yu, The Chinese University of Hong; XIANZHI LI, CUHK; Chi-Wing Fu, ; Daniel Cohen-Or, ; Pheng-Ann Heng,\",\r\n3662,Oral,StarGAN: Unified Generative Adversarial Networks for Controllable Multi-Domain Image-to-Image Translation,https://github.com/yunjey/StarGAN,\"Jaegul Choo, Korea University; Jung-Woo Ha, NAVER Corp; Munyoung Kim, The College of New Jersey; Yunjey Choi, Korea University; Minje Choi, Korea University; Sunghun Kim, HKUST\",\r\n932,Oral,Unsupervised Discovery of Object Landmarks as Structural Representations,https://github.com/YutingZhang/lmdis-rep,\"Yuting Zhang, University of Michigan; Yijie Guo, University of Michigan; Yixin Jin, ; Yijun Luo, University of Michigan; Zhiyuan He, University of Michigan; Honglak Lee, University of Michigan, USA\",\r\n2819,Poster,\"GeoNet: 
Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose\",https://github.com/yzcjtr/GeoNet,\"Zhichao Yin, Sensetime Group Limited; Jianping Shi, SenseTime\",\r\n3838,Poster,Learning Depth from Monocular Videos using Direct Methods,https://github.com/yzcjtr/GeoNet,\"Chaoyang Wang, Carnegie Mellon University; Jose Buenaposada, Universidad Rey Juan Carlos; Rui Zhu, Carnegie Mellon University; Simon Lucey,\",\r\n1552,Poster,Global versus Localized Generative Adversarial Nets,https://github.com/z331565360/Localized-GAN,\"Guo-Jun Qi, University of Central Florida; Liheng Zhang, University of Central Florida; Hao Hu, University of Central Florida\",\r\n1757,Poster,Learning to Promote Saliency Detectors,https://github.com/zengxianyu/lps,\"Yu Zeng, Dalian University of Technology; Huchuan Lu, Dalian University of Technology; Lihe Zhang, Dalian University of Technology; Mengyang Feng, DUT, student; Ali Borji, UCF\",\r\n76,Oral,Context Encoding for Semantic Segmentation,https://github.com/zhanghang1989/PyTorch-Encoding,\"Hang Zhang, Rutgers University; Kristin Dana, ; Jianping Shi, SenseTime; Zhongyue Zhang, Amazon; Xiaogang Wang, Chinese University of Hong Kong; Ambrish Tyagi, Amazon; Amit Agrawal, Amazon\",\r\n1410,Spotlight,Collaborative and Adversarial Network for Unsupervised domain adaptation,https://github.com/zhangweichen2006/iCAN,\"Weichen Zhang, The University of Sydney; Wanli Ouyang, The University of Sydney; Dong Xu, ; Wen Li, ETH\",\r\n1235,Poster,Progressive Attention Guided Recurrent Network for Salient Object Detection,https://github.com/zhangxiaoning666/PAGR,\"Xiaoning Zhang, Dalian University of Technolog; TIANTIAN WANG, Dalian University of Technolog; Jinqing Qi, ; Huchuan Lu, Dalian University of Technology\",\r\n2603,Spotlight,Cascade R-CNN: Delving into High Quality Object Detection,https://github.com/zhaoweicai/cascade-rcnn,\"Zhaowei Cai, UC San Diego; Nuno Vasconcelos, UCSD, USA\",\r\n2629,Spotlight,LEGO: Learning Edge with Geometry all at 
Once by Watching Videos,https://github.com/zhenheny/LEGO,\"Zhenheng Yang, ; Peng Wang, Baidu; Yang Wang, Baidu USA; Wei Xu, ; Ram Nevatia,\",\r\n801,Spotlight,Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination,https://github.com/zhirongw/lemniscate.pytorch,\"Zhirong Wu, UC Berkeley; Yuanjun Xiong, Amazon ; Stella Yu, UC Berkeley / ICSI; Dahua Lin, CUHK\",\r\n1193,Spotlight,Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks,https://github.com/zhjwustc/cvpr18_rnn_deblur_matcaffe,\"Jiawei Zhang, City University of Hong Kong; Jinshan Pan, UC Merced; Jimmy Ren, SenseTime Group Limited; Yibing Song, Tencent AI Lab; Linchao Bao, Tencent AI Lab; Rynson Lau, City University of Hong Kong; Ming-Hsuan Yang, UC Merced\",\r\n974,Spotlight,Weakly Supervised Instance Segmentation using Class Peak Response,https://github.com/ZhouYanzhao/PRM,\"Yanzhao Zhou, UCAS, China; Yi Zhu, UCAS; Qixiang Ye, ; Qiang Qiu, ; Jianbin Jiao,\",\r\n2517,Poster,Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks,https://github.com/zjuchenlong/sp-aen.cvpr18,\"Long Chen, ZJU; Hanwang Zhang, Columbia University; Jun Xiao, ZJU; Wei Liu, ; Shih-Fu Chang,\",\r\n1859,Spotlight,FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors,https://github.com/ZoieMo/Multi-task,\"Yu Chen, NUST; Ying Tai, Tencent; Xiaoming Liu, Michigan State University; Chunhua Shen, University of Adelaide; Jian Yang, Nanjing University of Science and Technology\",\r\n409,Poster,LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image,https://github.com/zouchuhang/LayoutNet,\"Chuhang Zou, UIUC; Alex Colburn, Zillow Group Inc.; Qi Shan, Zillow Group; Derek Hoiem,\",\r\n147,Poster,Interactive Image Segmentation with Latent Diversity,https://paperswithcode.com/paper/deep-interactive-object-selection#code,\"Zhuwen Li, Intel Labs; Qifeng Chen, Intel Labs; Vladlen Koltun, Intel Labs\",\r\n756,Poster,End-to-end Recovery of Human 
Shape and Pose,https://paperswithcode.com/paper/end-to-end-recovery-of-human-shape-and-pose#code,\"Angjoo Kanazawa, University of Maryland; Michael Black, Max Planck Institute for Intelligent Systems; David Jacobs, University of Maryland; Jitendra Malik,\",\r\n3633,Poster,Learning Face Age Progression: A Pyramid Architecture of GANs,https://paperswithcode.com/paper/learning-face-age-progression-a-pyramid#code,\"Hongyu Yang, BEIHANG UNIVERSITY; Di Huang, ; Yunhong Wang, ; Anil Jain, MSU\",\r\n687,Spotlight,Hand PointNet: 3D Hand Pose Estimation using Point Sets,https://sites.google.com/site/geliuhaontu/HandPointNet.zip?attredirects=0&d=1,\"Liuhao Ge, NTU; Junwu Weng, Nanyang Technological Univ.; Yujun Cai, NTU; Junsong Yuan, Nanyang Technological University\",https://sites.google.com/site/geliuhaontu/home/cvpr2018\r\n12,Oral,DensePose: Multi-Person Dense Human Pose Estimation In The Wild,,\"Alp Guler, INRIA; Natalia Neverova, Facebook AI Research; Iasonas Kokkinos, FAIR/UCL\",\r\n27,Spotlight,Shape from Shading through Shape Evolution,,\"Dawei Yang, University of Michigan; Jia Deng,\",\r\n53,Spotlight,TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays,,\"Xiaosong Wang, NIH; Yifan Peng, NIH NLM; Le Lu, Nvidia Corp; Zhiyong Lu, ; Ronald Summers,\",\r\n103,Spotlight,\"Real-World Repetition Estimation by Div, Grad and Curl\",,\"Tom Runia, University of Amsterdam; Cees Snoek, University of Amsterdam; Arnold Smeulders, University of Amsterdam, Netherlands\",\r\n128,Oral,Augmented Skeleton Space Transfer for Depth-based Hand Pose Estimation,,\"Seungryul Baek, Imperial College London; Kwang In Kim, University of Bath; Tae-Kyun Kim, Imperial College London\",\r\n142,Spotlight,Multi-view Harmonized Bilinear Network for 3D Object Recognition,,\"Tan Yu, Nanyang Technological Univ; Jingjing Meng, ; Junsong Yuan, Nanyang Technological University\",\r\n167,Spotlight,BlockDrop: Dynamic Inference Paths in Residual 
Networks,,\"Zuxuan Wu, University of Maryland; Tushar Nagarajan, University of Texas at Austin; Abhishek Kumar, ; Steven Rennie, ; Larry Davis, University of Maryland, USA; Kristen Grauman, ; Rogerio Feris, IBM\",\r\n181,Oral,Practical Block-wise Neural Network Architecture Generation,,\"Zhao Zhong, Institute of Automation,CAS; Junjie Yan, ; Wei Wu, ; Jing Shao, The Sensetime Group Limited; cheng-lin Liu,\",\r\n185,Oral,Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning,,\"Qi Wu, University of Adelaide; Peng Wang, ; Chunhua Shen, University of Adelaide; Ian Reid, ; Anton Van den Hengel, University of Adelaide\",\r\n242,Spotlight,Few-Shot Image Recognition by Predicting Parameters from Activations,,\"Siyuan Qiao, Johns Hopkins University; Chenxi Liu, JHU; Wei Shen, Shanghai University; Alan Yuille,\",\r\n270,Oral,Illuminant Spectra-based Source Separation Using Flash Photography,,\"Zhuo Hui, Carnegie Mellon University; Kalyan Sunkavalli, Adobe Systems Inc.; Sunil Hadap, ; Aswin Sankaranarayanan, Carnegie Mellon University\",\r\n281,Spotlight,Tracking Multiple Objects Outside the Line of Sight using Speckle Imaging,,\"Brandon Smith, University of Wisconsin-Madiso; Matthew O'Toole, Stanford University; Mohit Gupta, Wisconsin\",\r\n350,Spotlight,Deep Adversarial Metric Learning,,\"Yueqi Duan, Tsinghua University; Wenzhao Zheng, Tsinghua University; Xudong Lin, Tsinghua University; Jiwen Lu, Tsinghua University; Jie Zhou,\",\r\n374,Spotlight,Consensus Maximization for Semantic Region Correspondences,,\"Pablo Speciale, ETH; Danda Paudel, ; Martin Oswald, ETH Zurich; Hayko Riemenschneider, Computer Vision Lab, ETH Zurich; Luc Van Gool, KTH; Marc Pollefeys, ETH\",\r\n405,Spotlight,\"Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models\",,\"Jiuxiang Gu, Nanyang Technological Universi; Jianfei Cai, ; Joty Shafiq Rayhan, ; Li Niu, Rice University; Gang Wang,\",\r\n442,Spotlight,M3: Multimodal Memory 
Modelling for Video Captioning,,\"Junbo Wang, Institute of Automation, Chine; Wei Wang, ; Yan Huang, ; Liang Wang, unknown; Tieniu Tan, NLPR China\",\r\n465,Spotlight,Textbook Question Answering under Teacher Guidance with Memory Networks,,\"Juzheng Li, Tsinghua University; Hang Su, Tsinghua University; Jun Zhu, Tsinghua University; Siyu Wang, ; Bo Zhang,\",\r\n483,Oral,Left-Right Comparative Recurrent Model for Stereo Matching,,\"Zequn Jie, ; Pengfei Wang, NUS; Yonggen Ling, Tencent; Bo Zhao, ; Jiashi Feng, ; Wei Liu,\",\r\n487,Oral,Analytic Expressions for Probabilistic Moments of PL-DNN with Gaussian Input,,\"Adel Bibi, KAUST; Modar Alfadly, King Abdullah University of Science and Technology; Bernard Ghanem,\",\r\n500,Spotlight,Generating Synthetic X-ray Images of a Person from the Surface Geometry,,\"Brian Teixeira, Siemens Healthineers; Vivek Singh, Siemens Healthineers; Kai Ma, Siemens Healthineers; Birgi Tamersoy, Siemens Healthineers; Terrence Chen, Siemens Healthineers; Yifan Wu, Temple University ; Elena Balashova, Princeton University; Dorin Comaniciu, Siemens Healthineers\",\r\n552,Oral,An Analysis of Scale Invariance in Object Detection - SNIP,,\"Bharat Singh, ; Larry Davis, University of Maryland, USA\",\r\n565,Oral,Finding Tiny Faces in the Wild with Generative Adversarial Network,,\"Yancheng Bai, Kaust/Iscas; Yongqiang Zhang, Harbin institute of Technology/KAUST; Mingli Ding, ; Bernard Ghanem,\",\r\n567,Spotlight,SSNet: Scale Selection Network for Online 3D Action Prediction,,\"Jun Liu, Nanyang Technological University; Amir Shahroudy, NTU Singapore; Gang Wang, ; Ling-Yu Duan, ; Alex Kot,\",\r\n568,Spotlight,Integrated facial landmark localization and super-resolution of real-world very low resolution faces in arbitrary poses with GANs,,\"Adrian Bulat, ; Georgios Tzimiropoulos,\",\r\n601,Spotlight,NISP: Pruning Networks using Neuron Importance Score Propagation,,\"Ruichi Yu, ; Ang Li, Google DeepMind; Chun-Fu (Richard) Chen, IBM T.J. 
Watson Research Cente; Jui-Hsin Lai, ; Vlad Morariu, University of Maryland; Xintong Han, University of Maryland; Mingfei Gao, University of Maryland; Ching-Yung Lin, ; Larry Davis, University of Maryland, USA\",\r\n668,Spotlight,Fine-grained Video Captioning for Sports Narrative,,\"Huanyu Yu, Shanghai Jiao Tong University; Shuo Cheng, SJTU; Bingbing Ni, ; Minsi Wang, Shanghai Jiao Tong University; Zhang Jian, Shanghai Jiao Tong University; Xiaokang Yang,\",\r\n732,Spotlight,Visual Question Reasoning on General Dependency Tree,,\"Qingxing Cao, Sun Yat-Sen University; Xiaodan Liang, Carnegie Mellon University; Bailin Li, SUN-YAT SEN UNIVERSITY; Liang Lin,\",\r\n736,Spotlight,GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB,,\"Franziska Mueller, MPI Informatics; Florian Bernard, MPI Informatics; Oleksandr Sotnychenko, MPI Informatics; Dushyant Mehta, MPI For Informatics; Srinath Sridhar, ; Dan Casas, MPI; Christian Theobalt, MPI Informatics\",http://handtracker.mpi-inf.mpg.de/projects/GANeratedHands/\r\n754,Spotlight,\"SfSNet : Learning Shape, Reflectance and Illuminance of Faces `in the wild'\",,\"Soumyadip Sengupta, University of Maryland; Angjoo Kanazawa, University of Maryland; Carlos Castillo, ; David Jacobs, University of Maryland\",\r\n765,Spotlight,What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets,,\"De-An Huang, Stanford University; Vignesh Ramanathan, Facebook; Dhruv Mahajan, ; Juan Carlos Niebles, Stanford University; Fei-Fei Li, Stanford University; Lorenzo Torresani, Dartmouth College, USA; Manohar Paluri,\",\r\n778,Oral,\"Finding \"\"It\"\": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Video\",,\"De-An Huang, Stanford University; Shyamal Buch, Stanford University; Lucio Dery, Stanford University; Animesh Garg, Stanford University; Fei-Fei Li, Stanford University; Juan Carlos Niebles, Stanford University\",\r\n811,Spotlight,Visual Question Generation as Dual Task 
of Visual Question Answering,,\"Yikang Li, ; Nan Duan, Microsoft; Bolei Zhou, Massachusetts Institute of Technology; Xiao Chu, Baidu; Wanli Ouyang, The University of Sydney; Xiaogang Wang, Chinese University of Hong Kong\",\r\n812,Spotlight,Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Semantic Segmentation,,\"Yunchao Wei, ; Huaxin Xiao, ; Honghui Shi, UIUC; Zequn Jie, ; Jiashi Feng, ; Thomas Huang,\",\r\n844,Spotlight,TOM-Net: Learning Transparent Object Matting from a Single Image,,\"Guanying Chen, The University of Hong Kong; Kai Han, ; Kwan-Yee Kenneth Wong, The University of Hong Kong\",\r\n878,Spotlight,Learning to Detect Features in Texture Images,,\"Linguang Zhang, Princeton University; Szymon Rusinkiewicz, Princeton University\",\r\n906,Spotlight,Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images,,\"Hao Zhou, UMD; Jin Sun, University of Maryland; Yaser Yacoob, Univ of Maryland; David Jacobs, University of Maryland\",\r\n914,Spotlight,LSTM stack-based Neural Multi-sequence Alignment TeCHnique (NeuMATCH),,\"Pelin Dogan, ETH Zurich; Albert Li, Disney Research; Leonid Sigal, University of British Columbia; Markus Gross,\",\r\n984,Oral,Rotation Averaging and Strong Duality,,\"Anders Eriksson, ; Fredrik Kahl, Chalmers; Carl Olsson, Lund University; Tat-Jun Chin,\",\r\n1025,Spotlight,PPFNet: Global Context Aware Local Features for Robust 3D Point Matching,,\"Haowen Deng, Technical University of Munich; Tolga Birdal, Technical University of Munich; Slobodan Ilic, Siemens AG\",\r\n1057,Oral,Group Consistent Similarity Learning via Deep CRFs for Person Re-Identification,,\"Dapeng Chen, CUHK; Dan Xu, ; Hongsheng Li, ; Nicu Sebe, University of Trento, Italy; Xiaogang Wang, Chinese University of Hong Kong\",\r\n1097,Spotlight,Pose-Guided Photorealistic Face Rotation,,\"Yibo Hu, CRIPAC, CASIA; Xiang Wu, Institute of Automation, Chine; Bing Yu, ; Ran He, ; Zhenan Sun, CRIPAC\",\r\n1099,Spotlight,Automatic 3D 
Indoor Scene Modeling from Single Panorama,,\"Yang Yang, University of Delaware; Shi Jin, ShanghaiTech University; Ruiyang Liu, ; Sing Bing Kang, Microsoft Research; Jingyi Yu, University of Delaware, USA\",\r\n1101,Spotlight,SobolevFusion: 3D Reconstruction of Scenes Undergoing Free Non-rigid Motion,,\"Miroslava Slavcheva, Siemens AG; Maximilian Baust, TUM; Slobodan Ilic, Siemens AG\",\r\n1128,Oral,3D-RCNN: Instance-level 3D Scene Understanding via Render-and-Compare,,\"Abhijit Kundu, Georgia Institute of Technology; Yin Li, Georgia Tech; James Rehg, Georgia Institute of Technology\",\r\n1129,Spotlight,FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds,,\"Yaoqing Yang, Carnegie Mellon University; Chen Feng, MERL; Yiru Shen, Clemson University; Dong Tian, Mitsubishi Electric Research Laboratories\",\r\n1139,Spotlight,Unsupervised Textual Grounding: Linking Words to Image Concepts,,\"Raymond Yeh, UIUC; Minh Do, University of Illinois at Urbana-Champaign; Alex Schwing,\",\r\n1202,Spotlight,Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective,,\"Jing Zhang, ; Tong Zhang, Australian National University; Yuchao Dai, Australian National University; Mehrtash Harandi, Australian National University; Richard Hartley, Australian National University Australia\",\r\n1203,Spotlight,NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning,,\"Alexander Richard, University of Bonn; Hilde Kuehne, University of Bonn; Ahsan Iqbal, University of Bonn; Juergen Gall, University of Bonn, Germany\",\r\n1249,Spotlight,Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF,,\"Linchao Bao, Tencent AI Lab; Baoyuan Wu, Tencent AI Lab; Wei Liu,\",\r\n1276,Oral,Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation,,\"Henghui Ding, Nanyang Technological University; Xudong Jiang, Nanyang Technological University; Bing Shuai, ; Ai Qun Liu, Nanyang Technological University; Gang 
Wang,\",\r\n1280,Spotlight,VITAL: VIsual Tracking via Adversarial Learning,,\"Yibing Song, Tencent AI Lab; Chao Ma, ; Xiaohe Wu, Harbin Institute of technology; Lijun Gong, City University of Hong Kong; Linchao Bao, Tencent AI Lab; Wangmeng Zuo, Harbin Institute of Technology; Chunhua Shen, University of Adelaide; Rynson Lau, City University of Hong Kong; Ming-Hsuan Yang, UC Merced\",\r\n1289,Spotlight,Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning,,\"Jingwen Wang, SCUT; Wenhao Jiang, Tencent AI Lab; Lin Ma, Tencent AI Lab; Wei Liu, ; Yong Xu, South China University of Technology\",\r\n1317,Spotlight,Low-Latency Video Semantic Segmentation,,\"Yule Li, Ict; Jianping Shi, SenseTime; Dahua Lin, CUHK\",\r\n1324,Spotlight,Lean Multiclass Crowdsourcing,,\"Grant van Horn, California Institute of Technology; Pietro Perona, California Institute of Technology, USA; Serge Belongie,\",\r\n1356,Spotlight,Low-shot Learning from Imaginary Data,,\"Yu-Xiong Wang, Carnegie Mellon University; Ross Girshick, ; Martial Hebert, ; Bharath Hariharan, Cornell University\",\r\n1363,Spotlight,Partial Transfer Learning with Selective Adversarial Networks,,\"Zhangjie Cao, Tsinghua University; Mingsheng Long, Tsinghua University; Jianmin Wang,\",\r\n1449,Spotlight,A Variational U-Net for Conditional Appearance and Shape Generation,,\"Ekaterina Sutter, HCI, IWR,Heidelberg University; Patrick Esser, Heidelberg University; Bjorn Ommer, Heidelberg\",\r\n1458,Oral,Actor and Action Video Segmentation from a Sentence,,\"Kirill Gavrilyuk, University of Amsterdam; Amir Ghodrati, University of Amsterdam; zhenyang Li, University of Amsterdam; Cees Snoek, University of Amsterdam\",\r\n1567,Spotlight,Deep Photo Enhancer: Unsupervised Learning of Image Enhancement from Photographs with GANs,,\"Yu-Sheng Chen, National Taiwan University; Yu-Ching Wang, National Taiwan University; Man-Hsin Kao, National Taiwan University; Yung-Yu Chuang, National Taiwan 
University\",\r\n1593,Spotlight,Learning from Synthetic Data: Semantic Segmentation using Generative Adversarial Networks,,\"Swami Sankaranarayanan, University of Maryland; Yogesh Balaji, University of Maryland; Arpit Jain, ; Ser-Nam Lim, GE Global Research; Rama Chellappa, University of Maryland, USA\",\r\n1597,Spotlight,VizWiz Grand Challenge: Answering Visual Questions from Blind People,,\"Danna Gurari, University of Texas at Austin; Qing Li, USTC; Abigale Stangl, ; Anhong Guo, ; Chi Lin, ; Kristen Grauman, ; Jiebo Luo, University of Rochester; Jeffrey Bigham,\",\r\n1621,Spotlight,Multistage Adversarial Losses for Pose-Based Human Image Synthesis,,\"Chenyang Si, Institute of Automation, Chine; Wei Wang, ; Liang Wang, unknown; Tieniu Tan, NLPR China\",\r\n1643,Spotlight,Environment Upgrade Reinforcement Learning for Non-differentiable Multi-stage Pipelines,,\"Shuqin Xie, SJTU; Cewu Lu, Shanghai Jiao Tong University; Zitian Chen, Fudan University; Chao Xu, Shanghai Jiao Tong University\",\r\n1676,Spotlight,Learning Spatial-Aware Regressions for Visual Tracking,,\"Chong Sun, DalianUniversityofTechnology; Dong Wang, DUT; Huchuan Lu, Dalian University of Technology; Ming-Hsuan Yang, UC Merced\",\r\n1688,Spotlight,Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation,,\"Pengyuan Lyu, Huazhong University of Science and Technology; Cong Yao, Huazhong University of Science and Technology; Wenhao Wu, Megvii; Shuicheng Yan, National University of Singapore; Xiang Bai, Huazhong University of Science and Technology\",\r\n1697,Spotlight,Optimal Structured Light a la Carte,,\"Parsa Mirdehghan, University of Toronto; Wenzheng Chen, UofT; Kyros Kutulakos,\",\r\n1719,Spotlight,RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials,,\"Despoina Paschalidou, MPI Tuebingen; Carolin Schmitt, MPI Tuebingen; Osman Ulusoy, microsoft corporation; Luc Van Gool, KTH; Andreas Geiger, MPI Tuebingen / ETH Zuerich\",\r\n1775,Spotlight,A Papier-M_h 
Approach to Learning 3D Surface Generation,,\"Thibault GROUEIX, _ole des ponts ParisTech; Bryan Russell, Adobe; Mathew Fisher, Adobe Systems; Mathieu Aubry, ; Vladimir Kim, Adobe Research\",\r\n1919,Spotlight,Teaching Categories to Human Learners with Visual Explanations,,\"Oisin Mac Aodha, Caltech; Shihan Su, Caltech; Yuxin Chen, Caltech; Pietro Perona, California Institute of Technology, USA; Yisong Yue,\",\r\n1928,Spotlight,Going from Image to Video Saliency: Augmenting Image Salience with Dynamic Attentional Push,,\"Siavash Gorji, McGill University; James Clark, McGill University\",\r\n1956,Spotlight,Language-Based Image Editing with Recurrent attentive Models,,\"Yelong Shen, Microsoft; Jianbo Chen, UC Berkeley; Jianfeng Gao, ; JingJing Liu, Microsoft; Xiaodong Liu, Microsoft\",\r\n1957,Spotlight,SBNet: Sparse Block_ Network for Fast Inference,,\"Mengye Ren, Uber ATG; Andrei Pokrovsky, Uber ATG; Bin Yang, Uber ATG, UofT; Raquel Urtasun, University of Toronto\",\r\n1959,Spotlight,Learning Compositional Visual Concepts with Mutual Consistency,,\"Yunye Gong, Cornell University; Srikrishna Karanam, Siemens Corporate Technology; Ziyan Wu, Siemens Corporation; Kuan-Chuan Peng, Siemens Corporation; Jan Ernst, Siemens Corporation; Peter Doerschuk, Cornell University\",\r\n1988,Oral,Neural Inverse Kinematics for Unsupervised Motion Retargetting,,\"Ruben Villegas, University of Michigan; Jimei Yang, ; Duygu Ceylan, ; Honglak Lee, University of Michigan, USA\",\r\n2044,Spotlight,iVQA: Inverse Visual Question Answering,,\"Feng Liu, Southeast Univeristy; Tao Xiang, Queen Mary University of London; Timothy Hospedales, University of Edinburgh; Wankou Yang, Southeast University; Changyin Sun, Southeast University\",\r\n2082,Spotlight,\"The Easy, The Medium and The Hard: Adapting Across Varied Domain Shifts\",,\"Swami Sankaranarayanan, University of Maryland; Yogesh Balaji, University of Maryland; Carlos Castillo, ; Rama Chellappa, University of Maryland, 
USA\",\r\n2108,Spotlight,Weakly Supervised Coupled Networks for Visual Sentiment Analysis,,\"Jufeng Yang, Nankai University; Dongyu She, ; Yu-Kun Lai, Cardiff University; Paul Rosin, ; Ming-Hsuan Yang, UC Merced\",\r\n2161,Spotlight,Viewpoint-aware Video Summarization,,\"Atsushi Kanehira, University of Tokyo; Luc Van Gool, KTH; Yoshitaka Ushiku, ; Tatsuya Harada, University of Tokyo\",\r\n2194,Oral,Direction-aware Spatial Context Features for Shadow Detection,,\"Xiaowei Hu, CUHK; Lei Zhu, ; Chi-Wing Fu, ; Jing Qin, The Hong Kong Polytechnic University; Pheng-Ann Heng,\",\r\n2261,Spotlight,Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination,,\"Hazel Doughty, University of Bristol; Dima Damen, University of Bristol; Walterio Mayol-Cuevas,\",\r\n2266,Spotlight,HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization,,\"Bin Zhao, Northwestern Polytechnical Uni; Xuelong Li, ; Xiaoqiang Lu,\",\r\n2344,Spotlight,LIME: Live Intrinsic Material Estimation,,\"Abhimitra Meka, Max Planck Institute for Infor; Maxim Maximov, Graduate School of Computer Science, Saarland University; Michael Zollhöfer, MPI Informatics; Avishek Chatterjee, Max Planck Institute for Informatics; Hans-Peter Seidel, Max Planck Institute for Informatics; Christian Richardt, University of Bath; Christian Theobalt, MPI Informatics\",\r\n2351,Spotlight,Light field intrinsics with a deep encoder-decoder network,,\"Anna Alperovich, University of Konstanz; Ole Johannsen, University of Konstanz; Michael Strecke, University of Konstanz; Bastian Goldluecke,\",\r\n2355,Spotlight,Learning Monocular 3D Human Pose estimation on weakly-supervised Multi-view Images,,\"Helge Rhodin, epfl.ch; Jörg Spörri, Balgrist; Isinsu Katircioglu, EPFL Lausanne, Switzerland; Victor Constantin, EPFL; Frédéric Meyer, ; Erich Müller, ; Mathieu Salzmann, EPFL; Pascal Fua,\",\r\n2361,Spotlight,Emotional Attention: A Study of Image Sentiment and Visual Attention,,\"Shaojing Fan, National University of Singapo; 
Zhiqi Shen, National University of Singapore; Ming Jiang, University of Minnesota; Bryan Koenig, Southern Utah University; Juan Xu, University of Minnesota; Mohan Kankanhalli, National University of Singapore; Qi Zhao,\",\r\n2406,Spotlight,Through-Wall Human Pose Estimation Using Radio Signals,,\"Mingmin Zhao, MIT; Tianhong Li, MIT; Mohammad Abu Alsheikh, MIT; Yonglong Tian, Massachusetts Institute of Technology; Hang Zhao, MIT; Antonio Torralba, MIT; Dina Katabi, MIT\",\r\n2425,Spotlight,Learning Multi-grid Generative ConvNets by Minimal Contrastive Divergence,,\"Ruiqi Gao, UCLA; Yang Lu, University of California Los Angeles; Junpei Zhou, ; Song-Chun Zhu, ; Yingnian Wu,\",\r\n2464,Spotlight,Alive Caricature from 2D to 3D,,\"Qianyi Wu, USTC; Juyong Zhang, University of Science and Technology of China; Yu-Kun Lai, Cardiff University; Jianmin Zheng, Nanyang Technological University; Jianfei Cai,\",\r\n2485,Spotlight,Self-Supervised Feature Learning by Learning to Spot Artifacts,,\"Simon Jenni, Universität Bern; Paolo Favaro, Bern University, Switzerland\",\r\n2523,Oral,\"Revisiting Salient Object Detection: Simultaneous Detection, Ranking, and Subitizing of Multiple Salient Objects\",,\"Md Amirul Islam, University of Manitoba; Mahmoud Kalash, University of Manitoba; Neil D. B. 
Bruce, University of Manitoba\",\r\n2567,Oral,Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View,,\"Shuran Song, Princeton; Andy Zeng, Princeton; Angel Chang, Stanford University; Manolis Savva, ; Silvio Savarese, ; Thomas Funkhouser, Princeton\",\r\n2583,Spotlight,AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation,,\"Jogendra Kundu, Indian Institute of Science; Phani Krishna Uppala, Indian Institute of Science; Anuj Pahuja, Indian Institute of Science; Venkatesh Babu Radhakrishnan, Indian Institute of Science\",\r\n2612,Oral,Polarimetric Dense Monocular SLAM,,\"Luwei Yang, Simon Fraser University; Feitong Tan, Simon Fraser University; Ao Li, Simon Fraser University; Zhaopeng Cui, Simon Fraser University; Yasutaka Furukawa, ; Ping Tan,\",\r\n2657,Spotlight,Learning Globally Optimized Object Detector via Policy Gradient,,\"Yongming Rao, ; Dahua Lin, CUHK; Jiwen Lu, Tsinghua University\",\r\n2708,Spotlight,Multimodal Explanations: Justifying Decisions and Pointing to the Evidence,,\"Lisa Anne Hendricks, UC Berkeley; Trevor Darrell, UC Berkeley, USA; Anna Rohrbach, UC Berkeley; Zeynep Akata, University of Amsterdam; Bernt Schiele, MPI Informatics Germany; Marcus Rohrbach, UC Berkeley; Dong Huk Park, UC Berkeley\",\r\n2726,Spotlight,Divide and Grow: Capturing Huge Diversity in Crowd Images with Incrementally Growing CNN,,\"Deepak Babu Sam, Indian Institute of Science; Neeraj Sajjan, Indian Institute of Science; Venkatesh Babu Radhakrishnan, Indian Institute of Science; Mukundhan Srinivasan, NVIDIA\",\r\n2732,Spotlight,Structured Set Matching Networks for One-Shot Part Labeling,,\"Jonghyun Choi, ; Jayant Krishnamurthy, Semantic Machines; Aniruddha Kembhavi, Allen Institute for Artificial Intelligence; Ali Farhadi,\",\r\n2801,Spotlight,Learning Semantic Concepts and Order for Image and Sentence Matching,,\"Yan Huang, ; Qi Wu, University of Adelaide; Liang Wang, unknown\",\r\n2805,Spotlight,Modifying Non-Local Variations 
Across Multiple Views,,\"Tal Tlusty, Technion; Tomer Michaeli, Technion; Tali Dekel, Google; Lihi Zelnik-Manor,\",\r\n2827,Spotlight,Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection,,\"David Novotny, Oxford University; Samuel Albanie, Oxford University; Diane Larlus, NAVER LABS Europe; Andrea Vedaldi, U Oxford\",\r\n2851,Spotlight,Empirical study of the topology and geometry of deep networks,,\"Alhussein Fawzi, ; Seyed-Mohsen Moosavi-Dezfooli, ; Pascal Frossard, ; Stefano Soatto, UCLA\",\r\n2871,Oral,Photometric Stereo in Participating Media Considering Shape-Dependent Forward Scatter,,\"Yuki Fujimura, Kyoto University; Masaaki Iiyama, Kyoto University; Atsushi Hashimoto, Kyoto University; Michihiko Minoh, Kyoto University\",\r\n2937,Spotlight,Deformation Aware Image Compression,,\"Tamar Rott Shaham, Technion; Tomer Michaeli, Technion\",\r\n2964,Spotlight,Interpret Neural Networks by Identifying Critical Data Routing Paths,,\"Yulong Wang, Tsinghua University; Hang Su, Tsinghua University; Xiaolin Hu, tsinghua\",\r\n2972,Spotlight,Beyond Gr_ner Bases: Basis Selection for Minimal Solvers,,\"Viktor Larsson, Lund University; Magnus Oskarsson, Lund University Sweden; Kalle Astroem, Lund University; Alge Wallis, ; Zuzana Kukelova, Czech Technical University in Prague; Tomas Pajdla,\",\r\n2974,Spotlight,\"A Unifying Contrast Maximization Framework for Event Cameras, with Applications to Motion, Depth, and Optical Flow Estimation\",,\"Guillermo Gallego, University of Zurich; Henri Rebecq, University of Zurich; Davide Scaramuzza, University of Zurich\",\r\n2980,Spotlight,Fight ill-posedness with ill-posedness: Single-shot variational depth super-resolution from shading,,\"Bjoern Haefner, TU Munich; Yvain Queau, Technical University Munich; Thomas M_lenhoff, Technical University of Munich; Daniel Cremers,\",\r\n2989,Spotlight,Manifold Learning in Quotient Spaces,,\"_oi Mehr, LIP6; Andr Lieutier, ; Fernando Sanchez Bermudez, ; 
Vincent Guitteny, ; Nicolas Thome, Conservatoire national des arts et m_iers; Matthieu Cord,\",\r\n3001,Spotlight,Seeing Voices and Hearing Faces: Cross-modal biometric matching,,\"Arsha Nagrani, University of Oxford; Samuel Albanie, Oxford University; Andrew Zisserman, Oxford\",\r\n3013,Oral,\"Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net\",,\"Wenjie Luo, Uber ATG.; UofT; Bin Yang, Uber ATG, UofT; Raquel Urtasun, University of Toronto\",\r\n3021,Spotlight,Creating Capsule Wardrobes from Fashion Images,,\"Wei-Lin Hsiao, UT-Austin; Kristen Grauman,\",\r\n3025,Spotlight,Image Collection Pop-up: 3D Reconstruction and Clustering of Rigid and Non-Rigid Categories,,\"Antonio Agudo, IRI (CSIC-UPC); Francesc Moreno-Noguer, Institut de Robotica i Informatica Industrial (UPC/CSIC)\",\r\n3054,Oral,Trapping Light for Time of Flight,,\"Ruilin Xu, Columbia University; Mohit Gupta, Wisconsin; Shree Nayar, Columbia University\",\r\n3099,Oral,Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250Hz,,\"Ayush Tewari, MPI Informatics; Michael Zollh_er, MPI Informatics; Pablo Garrido, ; Florian Bernard, ; Hyeongwoo Kim, MPII; Patrick Perez, Technicolor Research; Christian Theobalt, MPI Informatics\",\r\n3124,Oral,\"CodeSLAM --- Learning a Compact, Optimisable Representation for Dense Visual SLAM\",,\"Michael Bloesch, Imperial College London; Jan Czarnowski, Imperial College London; Ronald Clark, Imperial College London; Stefan Leutenegger, Imperial College London; Andrew Davison, Imperial College London UK\",\r\n3156,Spotlight,Modeling Facial Geometry using Compositional VAEs,,\"Timur Bagautdinov, ; Chenglei Wu, Oculus; Jason Saragih, Oculus Research; Pascal Fua, ; Yaser Sheikh,\",\r\n3218,Spotlight,Deep Spatio-Temporal Random Fields for Efficient Video Segmentation,,\"Siddhartha Chandra, INRIA; Camille Couprie, Facebook Artificial Intelligence Research; Iasonas Kokkinos, 
FAIR/UCL\",\r\n3232,Spotlight,Robust Hough Transform Based 3D Reconstruction from Circular Light Fields,,\"Alessandro Vianello, Robert Bosch GmbH; Jens Ackermann, Robert Bosch GmbH; Maximilian Diebold, Heidelberg University; Bernd Jähne, University of Heidelberg\",\r\n3254,Spotlight,LDMNet: Low Dimensional Manifold Regularized Neural Networks,,\"Wei Zhu, Duke University; Qiang Qiu, ; Jiaji Huang, Baidu Silicon Valley AI Lab; Robert Calderbank, Duke University; Guillermo Sapiro, Duke; Ingrid Daubechies, Duke University\",\r\n3275,Oral,FlipDial: A Generative Model for Two-Way Visual Dialogue,,\"Daniela Massiceti, University of Oxford; Siddharth Narayanaswamy, University of Oxford; Puneet Kumar Dokania, University of Oxford; Phil Torr, Oxford\",\r\n3280,Spotlight,Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-grained Classification,,\"Li Niu, Rice University; Ashok Veeraraghavan, Rice University; Ashutosh Sabharwal,\",\r\n3289,Oral,OATM: Occlusion Aware Template Matching by Consensus Set Maximization,,\"Simon Korman, Weizmann Institute; Mark Milam, NGC; Stefano Soatto, UCLA\",\r\n3292,Spotlight,MX-LSTM: mixing tracklets and vislets to jointly forecast trajectories and head poses,,\"Irtiza Hasan, University of Verona; Francesco Setti, ; Theodore Tsesmelis, ; Alessio Del Bue, Istituto Italiano di Tecnologia (IIT); Fabio Galasso, ; Marco Cristani, U. Verona\",\r\n3308,Spotlight,MovieGraphs: Towards Understanding Human-Centric Situations from Videos,,\"Paul Vicol, University of Toronto; Makarand Tapaswi, University of Toronto; Lluís Castrejón, ; Sanja Fidler,\",\r\n3329,Spotlight,Lose The Views: Limited Angle CT Reconstruction via Implicit Sinogram Completion,,\"Rushil Anirudh, Lawrence Livermore National La; Hyojin Kim, Lawrence Livermore National Laboratory; Jayaraman J. Thiagarajan, LLNL; K. 
Aditya Mohan, Lawrence Livermore National Laboratory; Kyle Champley, Lawrence Livermore National Laboratory; Timo Bremer, Lawrence Livermore National Laboratory\",\r\n3330,Spotlight,Now You Shake Me: Towards Automatic 4D Cinema,,\"Yuhao Zhou, University of Toronto; Makarand Tapaswi, University of Toronto; Sanja Fidler,\",\r\n3340,Spotlight,\"Learning-Compression\"\" algorithms for neural net pruning\"\"\",,\"Miguel Carreira-perpinan, UC Merced; Yerlan Idelbayev, UC Merced\",\r\n3342,Spotlight,Reinforcement Cutting-Agent Learning for Video Object Segmentation,,\"Junwei Han, Northwestern Polytechnical U.; Le Yang, Northwestern Polytechnical Uni; Dingwen Zhang, ; Xiaojun Chang, Carnegie Mellon University; Xiaodan Liang, Carnegie Mellon University\",\r\n3381,Spotlight,Large-scale Distance Metric Learning with Uncertainty,,\"Qi Qian, Alibaba Group; Shenghuo Zhu, Alibaba Group; Rong Jin, Alibaba Group; Jiasheng Tang, Alibaba Group; Hao Li, Alibaba Group\",\r\n3394,Spotlight,Distributable Consistent Multi-Graph Matching,,\"Nan Hu, Stanford Unviversity; Boris Thibert, ; Leonidas J. 
Guibas,\",\r\n3399,Oral,VirtualHome: Simulating Household Activities via Programs,,\"Xavier Puig, MIT; Kevin Ra, ; Marko Boben, ; Jiaman Li, University of Toronto; Tingwu Wang, ; Sanja Fidler, ; Antonio Torralba, MIT\",\r\n3438,Spotlight,\"Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape from Images\",,\"Silvia Zuffi, IMATI-CNR; Angjoo Kanazawa, University of Maryland; Michael Black, Max Planck Institute for Intelligent Systems\",\r\n3482,Spotlight,A Common Framework for Interactive Texture Transfer,,\"Yifang Men, Peking University; Zhouhui Lian, ; Jianguo Xiao, Peking University\",\r\n3489,Spotlight,Guide Me: Interacting with Deep Networks,,\"Christian Rupprecht, Technische Universität München; Iro Laina, ; Federico Tombari, ; Nassir Navab, Technical University of Munich; Gregory Hager, Johns Hopkins University\",\r\n3516,Oral,Egocentric Activity Recognition on a Budget,,\"Rafael Possas, University of Sydney; Sheila Maricela Pinto Caceres, University of Sydney; Fabio Ramos, University of Sydney\",\r\n3536,Spotlight,Connecting Pixels to Privacy and Utility: Automatic Redaction of Private Information in Images,,\"Tribhuvanesh Orekondy, MPI-INF; Mario Fritz, MPI, Saarbrucken, Germany; Bernt Schiele, MPI Informatics Germany\",\r\n3595,Oral,Efficient Optimization for Rank-based Loss Functions,,\"Pritish Mohapatra, IIIT Hyderabad; Michal Rolinek, Max Planck Institute for Intelligent Systems, Tuebingen; C.V. Jawahar, IIIT Hyderabad; Vladimir Kolmogorov, Institute of Science and Technology, Austria; M. 
Pawan Kumar,\",\r\n3623,Oral,MakeupGAN: Makeup Transfer via Cycle-Consistent Adversarial Networks,,\"Huiwen Chang, ; Jingwan Lu, Adobe Research; Fisher Yu, UC Berkeley; Adam Finkelstein, Princeton University\",\r\n3626,Oral,Revisiting Deep Intrinsic Image Decompositions,,\"Qingnan Fan, Shandong University; David Wipf, Microsoft Research Asia; Jiaolong Yang, Microsoft Research Asia; Gang Hua, Microsoft Research; Baoquan Chen,\",\r\n3673,Spotlight,Learning Deep Descriptors with Scale-Aware Triplet Networks,,\"Michel Keller, ETH Z_ich; Zetao Chen, ETH Zurich; Fabiola Maffra, ETH Z_ich; Patrik Schmuck, ETH Zurich; Margarita Chli, ETH Zurich\",\r\n3723,Oral,Multi-Cell Classification by Convolutional Dictionary Learning with Class Proportion Priors,,\"Florence Yellin, Johns Hopkins University; Benjamin Haeffele, Johns Hopkins University; Rene Vidal, Johns Hopkins University\",\r\n3771,Spotlight,Art of singular vectors and universal adversarial perturbations,,\"Valentin Khrulkov, Skoltech; Ivan Oseledets, Skoltech\",\r\n3791,Spotlight,\"A Low Power, High Throughput, Fully Event-Based Stereo System\",,\"Alexander Andreopoulos, IBM Research; Hirak Kashyap, UC Irvine and IBM; Tapan Nayak, IBM; Arnon Amir, IBM; Myron Flickner, IBM\",\r\n3807,Spotlight,Jointly Localizing and Describing Events for Dense Video Captioning,,\"Yehao Li, Sun Yat-Sen University; Ting Yao, Microsoft Research Asia; Yingwei Pan, University of Science and Technology of China; Hongyang Chao, Sun Yat-sen University; Tao Mei, Microsoft Research Asia\",\r\n3815,Spotlight,Customized Image Narrative Generation via Interactive Visual Question Generation and Answering,,Andr,\r\n3817,Spotlight,Good Appearance Features for Multi-Target Multi-Camera Tracking,,\"Ergys Ristani, Duke University; Carlo Tomasi, Duke University\",\r\n3916,Spotlight,Unsupervised Person Image Synthesis in Arbitrary Poses,,\"Albert Pumarola, IRI (CSIC-UPC); Antonio Agudo, IRI (CSIC-UPC); Alberto Sanfeliu, IRI (CSIC-UPC); Francesc 
Moreno-Noguer, Institut de Robotica i Informatica Industrial (UPC/CSIC)\",\r\n3966,Oral,MapNet: An Allocentric Spatial Memory for Mapping Environments,,\"Joao Henriques, ; Andrea Vedaldi, U Oxford\",\r\n3968,Oral,A Globally Optimal Solution to the Non-Minimal Relative Pose Problem,,\"Jesus Briales, University of Malaga; Laurent Kneip, ; Javier Gonzalez-Jimenez,\",\r\n3971,Spotlight,Unsupervised Learning and Segmentation of Complex Activities from Video,,\"Fadime Sener, University of Bonn; Angela Yao, University of Bonn\",\r\n3977,Spotlight,Inferring Light Fields from Shadows,,\"Manel Baradad, MIT; Vickie Ye, MIT; Adam Yedida, MIT; Fredo Durand, ; William Freeman, MIT/Google; Gregory Wornell, ; Antonio Torralba, MIT\",\r\n4057,Spotlight,Captioning Images with Style Transfer from Unaligned Text Corpora,,\"Alexander Mathews, Australian National University; Xuming He, ShanghaiTech; Lexing Xie, Australian National University, Data61\",\r\n4217,Spotlight,Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks,,\"Ruth Fong, University of Oxford; Andrea Vedaldi, U Oxford\",\r\n4304,Spotlight,Unsupervised Sparse Dirichlet-Net for Hyperspectral Image Super-Resolution,,\"Ying Qu, The University of Tennessee; Hairong Qi, University of Tennessee; Chiman Kwan,\",\r\n4307,Oral,A Volumetric Descriptive Network for 3D Object Synthesis,,\"Jianwen Xie, UCLA; Zilong Zheng, ucla\",\r\n7,Poster,Video Captioning via Hierarchical Reinforcement Learning,,\"Xin Wang, UCSB; Wenhu Chen, ; Jiawei Wu, UCSB; Yuan-Fang Wang, UCSB; William Yang Wang, UCSB\",\r\n24,Poster,Rethinking the Faster R-CNN Architecture for Temporal Action Localization,,\"Yu-Wei Chao, University of Michigan; Sudheendra Vijayanarasimhan, Google Research; Bryan Seybold, Google Research; David Ross, Google Research; Jia Deng, ; Rahul Sukthankar, Google Research\",\r\n34,Poster,A High-Quality Denoising Dataset for Smartphone Cameras,,\"Abdelrahman Abdelhamed, York University; Stephen 
Lin, Microsoft Research Asia, China; Michael Brown, York University\",\r\n35,Poster,Improving Color Reproduction Accuracy in the Camera Imaging Pipeline,,\"Hakki Karaimer, York University; Michael Brown, York University\",\r\n41,Poster,pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment,,\"Je Hyeong Hong, University of Cambridge; Christopher Zach, Toshiba Research\",\r\n64,Poster,Towards Open-Set Identity Preserving Face Synthesis,,\"Jianmin Bao, USTC; Dong Chen, Microsoft Research Asia; Fang Wen, ; Houqiang Li, ; Gang Hua, Microsoft Research\",\r\n67,Poster,Baseline Desensitizing In Translation Averaging,,\"Bingbing Zhuang, National University of Singapore; Loong Fah Cheong, National University of Singapore; Gim Hee Lee, National University of SIngapore\",\r\n68,Poster,Learning from the Deep: A Revised Underwater Image Formation Model,,\"Derya Akkaynak, University of Haifa; Tali Treibitz, University of Haifa\",\r\n77,Poster,Deep Texture Manifold for Ground Terrain Recognition,,\"Jia Xue, Rutgers; Hang Zhang, Rutgers University; Kristin Dana,\",\r\n83,Poster,DS*: Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems,,\"Florian Bernard, ; Christian Theobalt, MPI Informatics; Michael Moeller, University of Siegen\",\r\n85,Poster,\"Sparse, Smart Contours to Represent and Edit Images\",,\"Tali Dekel, Google; Dilip Krishnan, Google; Chuang Gan, Tsinghua University; Ce Liu, Google, Cambridge, USA; William Freeman, Google\",\r\n92,Poster,Every Smile is Unique: Landmark-guided Diverse Smile Generation,,\"Wei Wang, University of Trento; Xavier Alameda-Pineda, University of Trento; Dan Xu, ; Elisa Ricci, U. 
Perugia; Nicu Sebe, University of Trento\",\r\n95,Poster,Generative Non-Rigid Shape Completion with Graph Convolutional Autoencoders,,\"Or Litany, Tel Aviv University; Alex Bronstein, ; Michael Bronstein, ; Ameesh Makadia, Google Research\",\r\n97,Poster,Learning a Discriminative Prior for Blind Image Deblurring,,\"Lerenhan Li, HUST; Jinshan Pan, UC Merced; Wei-Sheng Lai, University of California, Merced; Changxin Gao, HUST; Nong Sang, ; Ming-Hsuan Yang, UC Merced\",\r\n100,Poster,Attentional ShapeContextNet for Point Cloud Recognition,,\"Saining Xie, UCSD; Sainan Liu, UCSD; Zeyu Chen, UCSD; Zhuowen Tu, UCSD, USA\",\r\n102,Poster,Learning Superpixels with Segmentation-Aware Affinity Loss,,\"Wei-Chih Tu, National Taiwan University; Ming-Yu Liu, NVIDIA; Varun Jampani, NVIDIA Research; Deqing Sun, NVIDIA; Shao-Yi Chien, National Taiwan University; Ming-Hsuan Yang, UC Merced; Jan Kautz, NVIDIA\",\r\n109,Poster,MegaDepth: Learning Single-View Depth Prediction from Internet Photos,,\"Zhengqi Li, Cornell University; Noah Snavely, Cornell University / Google\",\r\n121,Poster,Instance Embedding Transfer to Unsupervised Video Object Segmentation,,\"Siyang Li, USC; Bryan Seybold, Google Research; Alexey Vorobyov, Google Inc.; Alireza Fathi, Stanford University; Qin Huang, University of Southern California; C.-C. 
Jay Kuo, University of Southern California\",\r\n122,Poster,Detect-and-Track: Efficient Pose Estimation in Videos,,\"Rohit Girdhar, CMU; Georgia Gkioxari, Facebook; Lorenzo Torresani, Dartmouth College, USA; Manohar Paluri, ; Du Tran, Dartmouth College\",\r\n124,Poster,Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval,,\"Chao Li, Xidian University; Cheng Deng, Xidian University; Ning Li, Xidian University; Wei Liu, ; Dacheng Tao, University of Sydney; Xinbo Gao,\",\r\n125,Poster,Guided Proofreading of Automatic Segmentations for Connectomics,,\"Daniel Haehn, Harvard University; Verena Kaynig, ; James Tompkin, Brown University; Jeff Lichtman, Harvard University; Hanspeter Pfister, Harvard University\",\r\n130,Poster,Context-aware Synthesis for Video Frame Interpolation,,\"Simon Niklaus, Portland State University; Feng Liu, Portland State University\",\r\n131,Poster,2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning,,\"Diogo Luvizon, ETIS Lab; David Picard, ETIS /LIP6; Hedi Tabia, ETIS / ENSEA\",\r\n135,Poster,NAG: Network for Adversary Generation,,\"Konda Reddy Mopuri, Indian Institute of Science; Utkarsh Ojha, MNNIT Allahabad; Utsav Garg, Nanyang Technological University; Venkatesh Babu Radhakrishnan, Indian Institute of Science\",\r\n165,Poster,W2F: A Weakly-Supervised to Fully-Supervised Framework for Object Detection,,\"Yongqiang Zhang, Harbin Institute of Technology/KAUST; Yancheng Bai, Kaust/Iscas; Mingli Ding, ; Yongqiang Li, ; Bernard Ghanem,\",\r\n170,Poster,BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning,,\"Ziming Zhang, MERL; Yuanwei Wu, University of Kansas; Guanghui Wang, University of Kansas\",\r\n179,Poster,3D Object Detection with Latent Support Surfaces,,\"Zhile Ren, Brown University; Erik Sudderth, UC Irvine\",\r\n186,Poster,Visual Grounding via Accumulated Attention,,\"Chaorui Deng, ; Qi Wu, University of Adelaide; Fuyuan Hu, ; Fan Lyu, Suzhou University of Science and 
Technology; Mingkui Tan, South China University of Technology; Qingyao Wu, School of Software Engineering, South China University of Technology\",\r\n191,Poster,Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors,,\"Xuanyi Dong, UTS; Shoou-I Yu, Oculus; Xinshuo Weng, Carnegie Mellon University; Shih-En Wei, Oculus Research; Yi Yang, ; Yaser Sheikh,\",\r\n195,Poster,ISTA-Net: Interpretable Optimization-Inspired Deep Network for Image Compressive Sensing,,\"Jian Zhang, KAUST; Bernard Ghanem,\",\r\n200,Poster,Perturbative Neural Networks: Rethinking Convolution in CNNs,,\"Felix Juefei-Xu, Carnegie Mellon University; Vishnu Naresh Boddeti, Michigan State University; Marios Savvides, Carnegie Mellon University\",\r\n216,Poster,Towards Pose Invariant Face Recognition in the Wild,,\"Jian Zhao, NUS; Yu Cheng, Nanyang Technological University; Yan Xu, Core Technology Group, Learning & Vision, Panasonic R&D Center Singapore; Lin Xiong, Core Technology Group, Learning & Vision, Panasonic R&D Center Singapore; Jianshu Li, National University of Singapo; Fang Zhao, National University of Singapore; Karlekar Jayashree, Core Technology Group, Learning & Vision, Panasonic R&D Center Singapore; Sugiri Pranata, Core Technology Group, Learning & Vision, Panasonic R&D Center Singapore; Shengmei Shen, Core Technology Group, Learning & Vision, Panasonic R&D Center Singapore; Junliang Xing, Institute of Automation, Chinese Academy of Sciences; Shuicheng Yan, National University of Singapore; Jiashi Feng,\",\r\n224,Poster,MoNet: Deep Motion Exploitation for Video Object Segmentation,,\"Huaxin Xiao, Nudt; Jiashi Feng, ; Guosheng Lin, Nanyang Technological Universi; Yu Liu, NUDT; Maojun Zhang,\",\r\n229,Poster,Exploring Disentangled Feature Representation Beyond Face Identification,,\"Yu Liu, CUHK; Fangyin Wei, Peking University; Jing Shao, The Sensetime Group Limited; Lu Sheng, The Chinese University of HK; Junjie Yan, ; Xiaogang 
Wang, Chinese University of Hong Kong\",\r\n232,Poster,Towards Effective Low-bitwidth Convolutional Neural Networks,,\"Bohan Zhuang, The University of Adelaide; Chunhua Shen, University of Adelaide; Mingkui Tan, South China University of Technology; Lingqiao Liu, University of Adelaide; Ian Reid,\",\r\n234,Poster,Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries,,\"Bohan Zhuang, The University of Adelaide; Qi Wu, University of Adelaide; Chunhua Shen, University of Adelaide; Ian Reid, ; Anton Van den Hengel, University of Adelaide\",\r\n237,Poster,Learning Facial Action Units from Web Images with Scalable Weakly Supervised Clustering,,\"Kaili Zhao, Beijing University of Post & T; Wen-Sheng Chu, Carnegie Mellon University; Aleix Martinez, The ohio state university\",\r\n246,Poster,Single-Shot Object Detection with Enriched Semantics,,\"Zhishuai Zhang, Johns Hopkins University; Siyuan Qiao, Johns Hopkins University; Cihang Xie, JHU; Wei Shen, Shanghai University; Bo Wang, HikVision USA Inc.; Alan Yuille, JHU\",\r\n250,Poster,Unifying Identification and Context Learning for Person Recognition,,\"Qingqiu Huang, CUHK; Yu Xiong, CUHK; Dahua Lin, CUHK\",\r\n252,Poster,Separating Self-Expression and Visual Content in Hashtag Supervision,,\"Andreas Veit, Cornel Tech ; Maximillian Nickel, ; Serge Belongie, ; Laurens van der Maaten, Facebook\",\r\n260,Poster,Beyond Trade-off: Accelerate FCN-based Face Detection with Higher Accuracy,,\"Guanglu Song, Beihang University; Yu Liu, CUHK; Ming Jiang, BUAA; Yujie Wang, Beihang university\",\r\n261,Poster,On the Robustness of Semantic Segmentation Models to Adversarial Attacks,,\"Anurag Arnab, University of Oxford; Ondrej Miksik, University of Oxford; Phil Torr, Oxford\",\r\n285,Poster,Improved Human Pose Estimation through Adversarial Data Augmentation,,\"Zhiqiang Tang, Rutgers; Xi Peng, ; Fei Yang, facebook; Rogerio Feris, IBM; Dimitris Metaxas, Rutgers\",\r\n289,Poster,Generative 
Adversarial Learning Towards Fast Weakly Supervised Detection,,\"Yunhang Shen, Xiamen University; Rongrong Ji, ; Shengchuan Zhang, ; Wangmeng Zuo, Harbin Institute of Technology; Yan Wang, Microsoft\",\r\n310,Poster,Soccer on Your Tabletop,,\"Konstantinos Rematas, University of Washington; Ira Kemelmacher, ; Brian Curless, Washington; Steve Seitz, Washington/Google\",\r\n312,Poster,Diversity Regularized Spatiotemporal Attention for Video-based Person Re-identification,,\"Shuang Li, The Chinese University of HK; Slawomir Bak, Disney Research; Peter Carr, Disney Research\",\r\n313,Poster,HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN,,\"Yue Cao, Tsinghua University; Mingsheng Long, Tsinghua University; Bin Liu, Tsinghua University; Jianmin Wang,\",\r\n329,Poster,Video Representation Learning Using Discriminative Pooling,,\"Jue Wang, ANU; Anoop Cherian, ; Fatih Porikli, NICTA, Australia; Stephen Gould, Australian National University\",\r\n330,Poster,Attend and Interact: Higher-Order Object Interactions for Video Understanding,,\"CHIH-YAO MA, GEORGIA TECH; Asim Kadav, NEC Labs; Iain Melvin, ; Zsolt Kira, ; Ghassan AlRegib, ; Hans Peter Graf,\",\r\n342,Poster,Human Pose Estimation with Parsing Induced Learner,,\"Xuecheng Nie, National University of Singapo; Jiashi Feng, ; Yiming Zuo, Tsinghua University; Shuicheng Yan,\",\r\n345,Poster,4D Human Body Correspondences from Panoramic Depth Maps,,\"Zhong Li, University of Delaware; Minye Wu, ShanghaiTech; Wangyiteng Zhou, ShanghaiTech University; Jingyi Yu, University of Delaware, USA\",\r\n346,Poster,Recognizing Human Actions as Evolution of Pose Estimation Maps,,\"Mengyuan Liu, Nanyang Technological University; Junsong Yuan,\",\r\n348,Poster,GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning,,\"Yueqi Duan, Tsinghua University; Ziwei Wang, Tsinghua University; Jiwen Lu, Tsinghua University; Xudong Lin, Tsinghua University; Jie Zhou,\",\r\n362,Poster,Graph-Cut RANSAC,,\"Daniel Barath, 
MTA SZTAKI; Jiri Matas,\",\r\n363,Poster,Five-point Fundamental Matrix Estimation for Uncalibrated Cameras,,\"Daniel Barath, MTA SZTAKI\",\r\n368,Poster,Optimizing Local Feature Descriptors for Nearest Neighbor Matching,,\"Kun He, Boston University; Yan Lu, ; Stan Sclaroff, Boston University\",\r\n391,Poster,Motion-Guided Cascaded Refinement Network for Video Object Segmentation,,\"Ping Hu, ; Gang Wang, ; Xiangfei Kong, Nanyang Technological University; Jason Kuen, NTU, Singapore; Yap-Peng Tan,\",\r\n397,Poster,Zigzag Learning for Weakly Supervised Object Detection,,\"Xiaopeng Zhang, National University of Singapore; Jiashi Feng, ; Hongkai Xiong, Shanghai Jiao Tong University; Qi Tian,\",\r\n408,Poster,Cross-Domain Self-supervised Multi-task Feature Learning Using Synthetic Game Imagery,,\"Zhongzheng Ren, UC Davis; Yong Jae Lee, UC Davis\",\r\n419,Poster,Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks,,\"Jason Kuen, NTU, Singapore; Xiangfei Kong, Nanyang Technological University; Zhe Lin, Adobe Systems, Inc.; Gang Wang, ; Jianxiong Yin, NVIDIA; Simon See, NVIDIA; Yap-Peng Tan,\",\r\n421,Poster,Deep End-to-End Time-of-Flight Imaging,,\"Shuochen Su, University of British Columbia; Felix Heide, Stanford University; Gordon Wetzstein, ; Wolfgang Heidrich,\",\r\n430,Poster,Face Aging with Identity-Preserved Conditional Generative Adversarial Networks,,\"Zongwei WANG, ; Xu Tang, ; Weixin Luo, Shanghaitech University; Shenghua Gao, ShanghaiTech University\",\r\n449,Poster,Classification Driven Dynamic Image Enhancement,,\"Vivek Sharma, Karlsruhe Institute of Technology; Ali Diba, ; Davy Neven, KU Leuven; Michael Brown, York University; Luc Van Gool, KTH; Rainer Stiefelhagen, Karlsruhe Institute of Technology\",\r\n460,Poster,Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification,,\"Jianlou Si, BUPT; Honggang Zhang, ; Chun-Guang Li, Beijing Univ. 
of Posts&Telecom; Jason Kuen, NTU, Singapore; Xiangfei Kong, Nanyang Technological University; Alex Kot, ; Gang Wang,\",\r\n468,Poster,Multi-Level Factorisation Net for Person Re-Identification,,\"Xiaobin Chang, Queen Mary Univ. of London; Timothy Hospedales, University of Edinburgh; Tao Xiang, Queen Mary University of London\",\r\n491,Poster,Reconstructing Thin Structures of Manifold Surfaces by Integrating Spatial Curves,,\"Shiwei Li, HKUST; Yao Yao, HKUST; Tian Fang, HKUST; Long Quan, The Hong Kong University of Science and Technology, Hong Kong\",\r\n494,Poster,Anticipating Traffic Accidents with Adaptive Loss and Large-scale Incident DB,,\"Tomoyuki Suzuki, Keio University; Hirokatsu Kataoka, AIST; Yoshimitsu Aoki, Keio University; Yutaka Satoh, AIST\",\r\n505,Poster,Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification,,\"Wenguan Wang, Beijing Institute of Technology; Yuanlu Xu, University of California, Los Angeles; Jianbing Shen, Beijing Institute of Technolog; Song-Chun Zhu,\",\r\n506,Poster,Unsupervised CCA,,\"Yedid Hoshen, Facebook AI Research (FAIR); Lior Wolf, Tel Aviv University, Israel\",\r\n510,Poster,Discovering Point Lights with Intensity Distance Fields,,\"Edward Zhang, University of Washington; MIchael Cohen, ; Brian Curless, Washington\",\r\n512,Poster,Universal Denoising Networks : A Novel CNN-based Network Architecture for Image Denoising,,\"Stamatios Lefkimmiatis, Skolkovo Institute of Science\",\r\n517,Poster,Easy Identification from Better Constraints: Multi-Shot Person Re-Identification from Reference Constraints,,\"Jiahuan Zhou, Northwestern University; Bing Su, Chinese Academy of Sciences; Ying Wu, Northwestern University, USA\",\r\n534,Poster,Recurrent Scene Parsing with Perspective Understanding in the Loop,,\"Shu Kong, University of California, Irvine; Charless Fowlkes, University of California, Irvine, USA\",\r\n540,Poster,Learning to Hash by Discrepancy Minimization,,\"Zhixiang Chen, 
Tsinghua University; Xin Yuan, Tsinghua University; Jiwen Lu, Tsinghua University; Jie Zhou,\",\r\n550,Poster,Disentangling Structure and Aesthetics for Content-aware Image Completion,,\"Andrew Gilbert, University of Surrey; John Collomosse, University of Surrey, UK.; Hailin Jin, ; Brian Price,\",\r\n569,Poster,The Best of Both Worlds: Combining CNNs and Geometric Constraints for Hierarchical Motion Segmentation,,\"Pia Bideau, University of Massachusets; Aruni RoyChowdhury, University of Massachusetts; Rakesh Radhakrishnan Menon, University of Massachusetts; Erik Miller,\",\r\n588,Poster,Coupled End-to-end Transfer Learning with Generalized Fisher Information,,\"Shixing Chen, Wayne State University; Caojin Zhang, Wayne State University; Ming Dong,\",\r\n589,Poster,Knowledge Aided Consistency for Weakly Supervised Phrase Grounding,,\"Kan Chen, Univ. of Southern California; Jiyang Gao, ; Ram Nevatia,\",\r\n593,Poster,Viewpoint-aware Attentive Multi-view Inference for Vehicle Re-identification,,\"Yi Zhou, University of East Anglia; Ling Shao, University of East Anglia\",\r\n594,Poster,MatNet: Modular Attention Network for Referring Expression Comprehension,,\"Licheng Yu, UNC Chapel Hill; Zhe Lin, Adobe Systems, Inc.; Xiaohui Shen, Adobe Research; Jimei Yang, ; Xin Lu, ; Mohit Bansal, UNC Chapel Hill; Tamara Berg, University on North carolina\",\r\n615,Poster,Learning Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision,,\"Yaojie Liu, Michigan State University; Amin Jourabloo, ; Xiaoming Liu, Michigan State University\",\r\n619,Poster,A Memory Network Approach for Story-based Temporal Summarization of 360 Videos,,\"Sangho Lee, Seoul National University; Jinyoung Sung, Seoul National University; Youngjae Yu, ; Gunhee Kim, Carnegie Mellon University\",\r\n620,Poster,Improving Occlusion and Hard Negative Handling for Single-Stage Object Detectors,,\"Junhyug Noh, Seoul National University; Soochan Lee, ; Beomsu Kim, ; Gunhee Kim, Carnegie Mellon 
University\",\r\n623,Poster,UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition,,\"Jiankang Deng, Imperial College London; Shiyang Cheng, Imperial College London; Niannan Xue, Imperial College London; Yuxiang Zhou, Imperial College; Stefanos Zafeiriou, Imperial College London\",\r\n631,Poster,Learning to Act Properly: Predicting and Explaining Affordances from Images,,\"Ching-Yao Chuang, University of Toronto; Jiaman Li, University of Toronto; Antonio Torralba, MIT; Sanja Fidler,\",\r\n648,Poster,Seeing Temporal Modulation of Lights from Standard Cameras,,\"Naoki Sakakibara, Nagoya Institute of Technology; Fumihiko Sakaue, Nagoya Institute of Technology; JUN SATO, Nagoya Institute of Technology\",\r\n649,Poster,Point-wise Convolutional Neural Networks,,\"Binh-Son Hua, SUTD; Khoi Tran, SUTD; Sai-Kit Yeung,\",\r\n672,Poster,Missing Slice Recovery for Tensors Using a Low-rank Model in Embedded Space,,\"Tatsuya Yokota, Nagoya Institute of Technology; Burak Erem, ; Seyhmus Guler, ; Simon Warfield, Harvard Medical School; Hidekata Hontani,\",\r\n673,Poster,Learning Convolutional Networks for Content-weighted Image Compression,,\"Mu LI, PolyU; Wangmeng Zuo, Harbin Institute of Technology; Shuhang Gu, ; debin Zhao, ; David Zhang, Hong Kong Polytechnic University\",\r\n680,Poster,Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation,,\"Kai Li, Chinese Academy of Sciences; Junliang Xing, Institute of Automation, Chinese Academy of Sciences; Chi Su, KingSoft; Weiming Hu, ; Yundong Zhang, Vimicro Corporation; Stephen Maybank, Birkbeck University of London\",\r\n700,Poster,Cube Padding for Weakly-Supervised Saliency Prediction in 360$^{\\circ}$ Videos,,\"Hsien-Tzu Cheng, National Tsing Hua University; Chun-Hung Chao, ; Jin-Dong Dong, ; Hao-Kai Wen, ; Tyng-Luh Liu, IIS/Academia Sinica; Min Sun, University of Washington\",\r\n710,Poster,A Face to Face Neural Conversation Model,,\"Hang Chu, University of Toronto; 
Sanja Fidler,\",\r\n717,Poster,Dynamic Video Segmentation Network,,\"Yu-Shuan Xu, National Tsing Hua University; Chun-Yi Lee, National Tsing Hua University; TSUJUI FU, NTHUCS; HsuanKung Yang, National Tsing Hua University\",\r\n721,Poster,Multiple Granularity Group Interaction Prediction,,\"Taiping Yao, Shanghai Jiaotong University; Minsi Wang, Shanghai Jiao Tong University; Huawei Wei, Shanghai Jiao Tong University; Bingbing Ni, ; Xiaokang Yang,\",\r\n733,Poster,From Lifestyle VLOGs to Everyday Interactions,,\"David Fouhey, UC Berkeley; WEICHENG KUO, Berkeley; Alexei Efros, UC Berkeley; Jitendra Malik,\",\r\n757,Poster,\"Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene\",,\"Shubham Tulsiani, UC Berkeley; David Fouhey, UC Berkeley; Saurabh Gupta, ; Alexei Efros, UC Berkeley; Jitendra Malik,\",\r\n759,Poster,Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction,,\"Shubham Tulsiani, UC Berkeley; Alexei Efros, UC Berkeley; Jitendra Malik,\",\r\n762,Poster,A Fast Resection-Intersection Method for the Known Rotation Problem,,\"Qianggong Zhang, The University of Adelaide; Tat-Jun Chin, ; Huu Le, The University of Adelaide\",\r\n784,Poster,Kernelized Subspace Pooling for Deep Local Descriptors,,\"Xing Wei, Xi'an Jiaotong University; Yihong Gong, Xi'an Jiaotong University; Yue Zhang, IAIR,Xi'an Jiaotong University; Nanning Zheng, Xi'an Jiaotong University\",\r\n786,Poster,Video Rain Removal By Multiscale Convolutional Sparse Coding,,\"Li Minghan, Xi'an Jiaotong University; Qi Xie, ; Qian Zhao, ; Wei Wei, Xi'an Jiaotong University; Shuhang Gu, ; Jing Tao, ; Deyu Meng, Xi'an Jiaotong University\",\r\n792,Poster,Referring Relationships,,\"Ranjay Krishna, Stanford University; Ines Chami, Stanford University; Michael Bernstein, Stanford University; Fei-Fei Li, Stanford University\",\r\n816,Poster,Learning Dual Convolutional Neural Networks for Low-Level Vision,,\"Jinshan Pan, UC Merced; Sifei Liu, ; Deqing Sun, NVIDIA; Jiawei 
Zhang, City University of Hong Kong; Yang Liu, DUT; Jimmy Ren, SenseTime Group Limited; Zechao Li, Nanjing University of Science and Technology ; Jinhui Tang, ; Huchuan Lu, Dalian University of Technology; Yu-Wing Tai, Tencent YouTu; Ming-Hsuan Yang, UC Merced\",\r\n849,Poster,Semantic Visual Localization,,\"Johannes Schönberger, ETH Zurich; Marc Pollefeys, ETH; Andreas Geiger, MPI Tuebingen / ETH Zuerich; Torsten Sattler, ETH Zurich\",\r\n851,Poster,Joint Cuts and Matching of Partitions in One Graph,,\"Tianshu Yu, Arizona State University; Junchi Yan, Shanghai Jiao Tong University; Jieyi Zhao, University of Texas Health Science Center at Houston; Baoxin Li, Arizona State University\",\r\n888,Poster,Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification,,\"Jingya Wang, QMUL; Xiatian Zhu, Vision Semantics Ltd.; Shaogang Gong, Queen Mary University; Wei Li, Queen Mary University of Lond\",\r\n890,Poster,CarFusion: Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of Vehicles,,\"Dinesh reddy Narapureddy, Carnegie mellon university; Minh Vo, CMU; Srinivasa Narasimhan, Carnegie Mellon University\",\r\n894,Poster,Deep Material-aware Cross-spectral Stereo Matching,,\"Tiancheng Zhi, Carnegie Mellon University; Bernardo Pires, CMU; Martial Hebert, Carnegie Mellon University; Srinivasa Narasimhan, Carnegie Mellon University\",\r\n908,Poster,Harmonious Attention Network for Person Re-Identification,,\"Wei Li, Queen Mary University of Lond; Xiatian Zhu, Vision Semantics Ltd.; Shaogang Gong, Queen Mary University\",\r\n910,Poster,Pseudo-Mask Augmented Object Detection,,\"Xiangyun Zhao, Northwestern University; Shuang Liang, Tongji University; Yichen Wei, Microsoft Research Asia\",\r\n944,Poster,Monocular Relative Depth Perception with Web Stereo Data Supervision,,\"Ke Xian, Huazhong University of Science and Technology; Chunhua Shen, University of Adelaide; Zhiguo Cao, Huazhong University of Science and 
Technology; Hao Lu, Huazhong University of Science and Technology; yang xiao, Huazhong University of Science and Technology; Ruibo Li, Huazhong University of Science and Technology; Zhenbo Luo, Samsung Research Beijing\",\r\n952,Poster,Objects as context for detecting their semantic parts,,\"Abel Gonzalez-Garcia, University of Edinburgh; Davide Modolo, Amazon; Vitto Ferrari,\",\r\n961,Poster,Conditional Generative Adversarial Network for Structured Domain Adaptation,,\"Weixiang Hong, Nanyang Technological Universi; Zhenzhen Wang, Nanyang Technological University; Ming Yang, Horizon Robotics Inc.; Junsong Yuan, Nanyang Technological University\",\r\n962,Poster,Rotation-sensitive Regression for Oriented Scene Text Detection,,\"Minghui Liao, Huazhong University of Science and Technology; Zhen Zhu, Huazhong University of Science and Technology; Baoguang Shi, Huazhong University of Science and Technology; Gui-Song Xia, Wuhan University; Xiang Bai, Huazhong University of Science and Technology\",\r\n963,Poster,Residual Parameter Transfer for Deep Domain Adaptation,,\"Artem Rozantsev, EPFL; Mathieu Salzmann, EPFL; Pascal Fua,\",\r\n978,Poster,Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network,,\"Daniel Merget, Technical University of Munich; Matthias Rock, TUM; Rigoll Gerhard, TUM\",\r\n1001,Poster,Feature Quantization for Defending Against Distortion of Images,,\"Zhun Sun, Tohoku University; Mete Ozay, ; Yan Zhang, RIKEN Center for AIP ; Xing Liu, Tohoku University; Takayuki Okatani, Tohoku University/RIKEN AIP\",\r\n1019,Poster,Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts,,\"Yizhe Zhu, ; Mohamed Elhoseiny, FAIR; Bingchen Liu, Rutgers; Ahmed Elgammal,\",\r\n1020,Poster,A Minimalist Approach to Type-Agnostic Detection of Quadrics in Point Clouds,,\"Tolga Birdal, Technical University of Munich; Benjamin Busam, Framos; Nassir Navab, Technical University of Munich; Slobodan Ilic, Siemens AG; 
Peter Sturm, INRIA Rhone-Alpes\",\r\n1022,Poster,A Benchmark for Articulated Human Pose Estimation and Tracking,,\"Mykhaylo Andriluka, MPI Informatics; Umar Iqbal, ; Eldar Insafutdinov, MPI Informatics; Anton Milan, University of Adelaide; Leonid Pishchulin, MPI Informatik; Juergen Gall, University of Bonn, Germany; Bernt Schiele, MPI Informatics Germany\",\r\n1035,Poster,Super-Resolving Very Low-Resolution Face Images with Supplementary Attributes,,\"Xin Yu, Australian National University; Basura Fernando, ANU Canberra Australia; Richard Hartley, Australian National University Australia; Fatih Porikli, NICTA, Australia\",\r\n1036,Poster,Video Person Re-identification with Competitive Snippet-similarity Aggregation and Co-attentive Snippet Embedding,,\"Dapeng Chen, CUHK; Hongsheng Li, ; Tong Xiao, The Chinese University of HK; Shuai Yi, The Chinese University of Hong Kong; Xiaogang Wang, Chinese University of Hong Kong\",\r\n1037,Poster,One-shot Action Localization by Sequence Matching Network,,\"Hongtao Yang, Australian National University; Xuming He, ShanghaiTech; Fatih Porikli, NICTA, Australia\",\r\n1052,Poster,Efficient Subpixel Refinement with Symbolic Linear Predictors,,\"Vincent Lui, Monash University; Jonathon Geeves, Monash University; Winston Yii, Monash University; Tom Drummond, Monash\",\r\n1056,Poster,Distort-and-Recover: Color Enhancement using Deep Reinforcement Learning,,\"Jongchan Park, KAIST; Joon-Young Lee, ; Donggeun Yoo, Lunit; In So Kweon, KAIST\",\r\n1058,Poster,Single Image Reflection Separation with Perceptual Losses,,\"Xuaner Zhang, UC Berkeley; Qifeng Chen, Intel Labs\",\r\n1067,Poster,Recognize Actions by Disentangling Components of Dynamics,,\"Yue Zhao, CUHK; Yuanjun Xiong, Amazon ; Dahua Lin, CUHK\",\r\n1078,Poster,Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains,,\"Jiahao Pang, SenseTime Group Limited; Wenxiu Sun, SenseTime Group Limited; Chengxi Yang, SenseTime Group Limited; Jimmy Ren, SenseTime Group Limited; 
Ruichao Xiao, ; Jin Zeng, The Hong Kong University of Science and Technology; Liang Lin,\",\r\n1082,Poster,Attention-aware Compositional Network for Person Re-Identification,,\"Jing Xu, SenseNets Technology Limited; Rui Zhao, SenseNets Technology Limited; Feng Zhu, SenseNets Technology Limited; Huaming Wang, SenseNets Technology Limited; Wanli Ouyang, The University of Sydney\",\r\n1083,Poster,HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification,,\"Amos Sironi, Prophesee; Manuele Brambilla, Prophesee; Nicolas Bourdis, prophesee; Xavier Lagorce, Prophesee; Ryad Benosman, Universite Pierre et Marie Curie-Paris\",\r\n1085,Poster,Mask-guided Contrastive Attention Model for Person Re-Identification,,\"Chunfeng Song, CASIA; Yan Huang, ; Wanli Ouyang, ; Liang Wang, unknown\",\r\n1103,Poster,A Biresolution Spectral framework for Product Quantization,,\"Lopamudra Mukherjee, University of Wisc Whitewater; Sathya Ravi, University of Wisconsin-Madison; Jiming Peng, University of Houston; Vikas Singh, University of Wisconsin-Madison\",\r\n1109,Poster,Dynamic Zoom-in Network for Fast Object Detection in Large Images,,\"Mingfei Gao, University of Maryland; Ruichi Yu, ; Ang Li, Google DeepMind; Vlad Morariu, University of Maryland; Larry Davis, University of Maryland, USA\",\r\n1110,Poster,On the Importance of Label Quality for Semantic Segmentation,,\"Aleksandar Zlateski, MIT; ronnachai Jaroensri, Massachusetts Institute of Technology; Prafull Sharma, MIT; Fredo Durand,\",\r\n1113,Poster,EPINET: A Fully-Convolutional Neural Network for Light Field Depth Estimation by Using Epipolar Geometry,,\"Changha Shin, Yonsei Univ; Hae-Gon Jeon, KAIST; Youngjin Yoon , ; InSo Kweon, ; Seon Joo Kim, Yonsei University\",\r\n1118,Poster,Erase or Fill? 
Deep Joint Recurrent Rain Removal and Reconstruction in Videos,,\"Jiaying Liu, Peking University; Wenhan Yang, Peking University; Shuai Yang, Peking University; Zongming Guo,\",\r\n1124,Poster,Scalable and Effective Deep CCA via Soft Decorrelation,,\"Xiaobin Chang, Queen Mary Univ. of London; Tao Xiang, Queen Mary University of London; Timothy Hospedales, University of Edinburgh\",\r\n1126,Poster,High-order tensor regularization with application to attribute ranking,,\"Kwang In Kim, University of Bath; Juhyun Park, Lancaster University; James Tompkin, Brown University\",\r\n1133,Poster,Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network,,\"Wenda Zhao, Dalian University of Technolog; Dong Wang, DUT; Huchuan Lu, Dalian University of Technology\",\r\n1134,Poster,Decorrelated Batch Normalization,,\"Lei Huang, BeiHang university; Dawei Yang, University of Michigan; Bo Lang, Beihang University; Jia Deng,\",\r\n1162,Poster,Low-Shot Recognition with Imprinted Weights,,\"Hang Qi, UCLA; Matthew Brown, ; David Lowe,\",\r\n1170,Poster,Facelet-Bank for Fast Portrait Manipulation,,\"Ying-Cong Chen, CUHK; Lin Huaijia, the Chinese University of Hong Kong; Ruiyu Li, CUHK; Michelle Shu, ; Xin Tao, CUHK; Yangang Ye, Tencent; Xiaoyong Shen, CUHK; Jiaya Jia, Chinese University of Hong Kong\",\r\n1173,Poster,Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation,,\"Xiaowei Xu, University of Notre Dame; Yiyu Shi, University of Notre Dame; Qing Lu, University of Notre Dame; Lin Yang, University of Notre Dame; Sharon Hu, University of Notre Dame; Danny Chen, University of Notre Dame\",\r\n1178,Poster,Structure Preserving Video Prediction,,\"Xu Jingwei, Shanghai Jiao Tong University; Bingbing Ni, ; Zefan Li, Shanghai Jiaotong University; Shuo Cheng, SJTU; Xiaokang Yang,\",\r\n1182,Poster,Tagging Like Humans: Diverse and Distinct Image Annotation,,\"Baoyuan Wu, Tencent AI Lab; Weidong Chen, Tencent; Wei Liu, ; Peng Sun, 
Tencent; Bernard Ghanem, ; Siwei Lyu, SUNY Albany\",\r\n1185,Poster,Learning to Sketch with Shortcut Cycle Consistency,,\"Jifei Song, Queen Mary, Uni. of London; Kaiyue Pang, QMUL; Yi-Zhe Song, ; Tao Xiang, Queen Mary University of London; Timothy Hospedales, University of Edinburgh\",\r\n1186,Poster,GroupCap: Group-based Image Captioning with Structured Relevance and Diversity Constraints,,\"Fuhai Chen, Xiamen university; Rongrong Ji, ; Xiaoshuai Sun, Harbin Institute of Technology; Jinsong Su, Xiamen university\",\r\n1194,Poster,Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning,,\"Xingping Dong, Beijing Institute of Technology; Jianbing Shen, Beijing Institute of Technolog; Wenguan Wang, Beijing Institute of Technology; Yu Liu, Beijing Institute of Technology; Ling Shao, University of East Anglia; Fatih Porikli, NICTA, Australia\",\r\n1213,Poster,Augmenting Crowd-Sourced 3D Reconstructions using Semantic Detections,,\"True Price, UNC Chapel Hill; Johannes Schönberger, ETH Zurich; Zhen Wei, University of North Carolina; Marc Pollefeys, ETH; Jan-Michael Frahm, UNC Chapel Hill\",\r\n1219,Poster,Visual Relationship Learning with a Factorization-based Prior,,\"SEONG JAE HWANG, University of Wisconsin - Madison; Zirui Tao , University of Wisconsin - Madi; Vikas Singh, University of Wisconsin-Madison; Hyunwoo Kim, Amazon Lab 126; Sathya Ravi, University of Wisconsin-Madison; Maxwell Collins,\",\r\n1224,Poster,Re-weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation,,\"Qingchao Chen, University College London; Yang Liu, University of Cambridge; Zhaowen Wang, Adobe; Ian Wassell, ; Kevin Chetty,\",\r\n1226,Poster,Flow Guided Recurrent Neural Encoder for Video Salient Object Detection,,\"Guanbin Li, ; Yuan Xie, ; Tianhao Wei, ; Liang Lin,\",\r\n1230,Poster,Disentangling 3D Pose in A Dendritic CNN for Unconstrained 2D Face Alignment,,\"Amit Kumar, University of Maryland; Rama Chellappa, University of Maryland, 
USA\",\r\n1244,Poster,Unsupervised Learning of Depth and Egomotion from Monocular Video Using 3D Geometric Constraints,,\"Reza Mahjourian, University of Texas at Austin; Martin Wicke, Google Brain; Anelia Angelova, Google Brain\",\r\n1252,Poster,Gated Fusion Network for Single Image Dehazing,,\"Wenqi Ren, Chinese Academy of Sciences; Lin Ma, Tencent AI Lab; Jiawei Zhang, City University of Hong Kong; Jinshan Pan, UC Merced; Xiaochun Cao, Chinese Academy of Sciences; Wei Liu, ; Ming-Hsuan Yang, UC Merced\",\r\n1258,Poster,Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks,,\"Ping Wei, Xi'an Jiaotong University; Yang Liu, UCLA; Tianmin Shu, University of California, Los Angeles; Nanning Zheng, Xi'an Jiaotong University; Song-Chun Zhu,\",\r\n1264,Poster,End-to-end Flow Correlation Tracking with Spatial-temporal Attention,,\"Zheng Zhu, Institute of Automation, CAS; Wei Wu, ; Wei Zou, ; Junjie Yan,\",\r\n1271,Poster,Left/Right Asymmetric Layer Skippable Networks,,\"Changmao Cheng, Fudan University; Yanwei Fu, fudan; Yu-Gang Jiang, Fudan University; Wei Liu, ; wenlian Lu, Fudan; Jianfeng Feng, fudan university; Xiangyang Xue,\",\r\n1288,Poster,Edit Probability for Scene Text Recognition,,\"Fan Bai, Fudan University; Zhanzhan Cheng, Hikvision Research Institute; Yi Niu, Hikvision Research Institute; Shiliang Pu, ; Shuigeng Zhou, Fudan University\",\r\n1290,Poster,Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning,,\"Yu Wu, University of technology sydne; Yutian Lin, ; Xuanyi Dong, UTS; Yan Yan, UTS; Wanli Ouyang, The University of Sydney; Yi Yang,\",\r\n1294,Poster,Learning to Localize Sound Source in Visual Scenes,,\"Arda Senocak, KAIST; Junsik Kim, Korea Advanced Institute of Science and Technology (KAIST); Tae-Hyun Oh, MIT; Ming-Hsuan Yang, UC Merced; In So Kweon, KAIST\",\r\n1303,Poster,Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object 
Features,,\"Xiang Wang, Tsinghua University; Shaodi You, Data61; Xi Li, Tsinghua University; Huimin Ma, Tsinghua University\",\r\n1304,Poster,SINT++: Robust Visual Tracking via Adversarial Hard Positive Generation,,\"Xiao Wang, Anhui university; Chenglong Li, Anhui University; Bin Luo, ; Jin Tang,\",\r\n1308,Poster,Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer,,\"Amir Atapour-Abarghouei, Durham University; Toby Breckon, Durham University\",\r\n1335,Poster,Imagination-IQA: No-reference Image Quality Assessment via Adversarial Learning,,\"Kwan-Yee Lin, Peking University\",\r\n1342,Poster,Memory Matching Networks for One-Shot Image Recognition,,\"Qi Cai, University of Science and Technology of China; Yingwei Pan, University of Science and Technology of China; Ting Yao, Microsoft Research Asia; Chenggang Yan, Hangzhou Dianzi University, China; Tao Mei, Microsoft Research Asia\",\r\n1343,Poster,3D Human Pose Estimation in the Wild by Adversarial Learning,,\"Wei Yang, The Chinese University of Hong Kong ; Wanli Ouyang, The University of Sydney; Xiaolong Wang, Carnegie Mellon University; Xiaogang Wang, Chinese University of Hong Kong\",\r\n1350,Poster,Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective,,\"Suryansh Kumar, Australian National University; Anoop Cherian, ; Yuchao Dai, Australian National University; Hongdong Li, Australian National University\",\r\n1360,Poster,Deep Regression Forests for Age Estimation,,\"Wei Shen, Shanghai University; Yilu Guo, Shanghai University; Yan Wang, JHU; KAI ZHAO, Nankai University; Bo Wang, HikVision USA Inc.; Alan Yuille,\",\r\n1366,Poster,A Bi-directional Message Passing Model for Salient Object Detection,,\"Lu Zhang, Dalian University of Technolog; Ju Dai, Dalian University of Technolog; Huchuan Lu, Dalian University of Technology; You He, ; Gang Wang,\",\r\n1369,Poster,Transductive Unbiased Embedding for Zero-Shot Learning,,\"Jie Song, Zhejiang 
University; Chengchao Shen, Zhejiang University; Yezhou Yang, Arizona State University; Yang Liu, ; Mingli Song, Zhejiang University\",\r\n1376,Poster,Scale-Transferrable Object Detection,,\"Peng Zhou, Sjtu; Bingbing Ni, ; Cong Geng, sjtu; jianguo Hu, Minivision; Yi Xu, Shanghai Jiao Tong University\",\r\n1381,Poster,Deep Cauchy Hashing for Hamming Space Retrieval,,\"Yue Cao, Tsinghua University; Mingsheng Long, Tsinghua University; Bin Liu, Tsinghua University; Jianmin Wang,\",\r\n1387,Poster,Demo2Vec: Reasoning Object Affordances from Online Videos,,\"Te-Lin Wu, USC; Kuan Fang, Stanford University; Daniel Yang, University of Southern California; Joseph Lim, University of Southern California\",\r\n1389,Poster,GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition,,\"Yifan Feng, Xidian university; Zizhao Zhang, ; xibin Zhao, ; Rongrong Ji, ; Yue Gao, Tsinghua University\",\r\n1392,Poster,Stereoscopic Neural Style Transfer,,\"Dongdong Chen, ; Lu Yuan, Microsoft Research Asia; Jing Liao, ; Nenghai Yu, ; Gang Hua, Microsoft Research\",\r\n1401,Poster,Bootstrapping the Performance of Webly Supervised Semantic Segmentation,,\"Tong Shen, The University of Adelaide; Guosheng Lin, Nanyang Technological Universi; Chunhua Shen, University of Adelaide; Ian Reid,\",\r\n1408,Poster,Learning Markov Clustering Networks for Scene Text Detection,,\"ZICHUAN LIU, Nanyang Technological Universi; Guosheng Lin, Nanyang Technological Universi; Sheng Yang, Nanyang Technological University; Jiashi Feng, ; Weisi Lin, Nanyang Technological University; Wangling Goh, Nanyang Technological University\",\r\n1428,Poster,Reflection Removal for Large-Scale 3D Point Clouds,,\"Jae-Seong Yun, UNIST; Jae-Young Sim, UNIST\",\r\n1432,Poster,Pose Transferrable Person Re-Identification,,\"Jinxian Liu, Shanghai Jiao Tong University; Yichao Yan, Shanghai Jiao Tong University; Bingbing Ni, ; Peng Zhou, Sjtu; Shuo Cheng, SJTU; jianguo Hu, Minivision\",\r\n1439,Poster,Efficient Diverse 
Ensemble for Discriminative Co-Tracking,,\"Kourosh Meshgi, Kyoto University; Shigeyuki Oba, Kyoto University; Shin Ishii, Kyoto University\",\r\n1443,Poster,Probabilistic Plant Modeling via Multi-View Image-to-Image Translation,,\"Takahiro Isokane, Osaka university; Fumio Okura, Osaka University; Ayaka Ide, Osaka University; Yasuyuki Matsushita, Osaka University; Yasushi Yagi, Osaka University\",\r\n1462,Poster,Towards a Mathematical Understanding of the Difficulty in Learning with Feedforward Neural Networks,,\"Hao Shen, Fortiss GmbH\",\r\n1467,Poster,Weakly-supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation,,\"Yong Zhang, CASIA; Weiming Dong, ; Bao-Gang Hu, CASIA; Qiang Ji, RPI\",\r\n1487,Poster,The power of ensembles for active learning in image classification,,\"William Beluch, Bosch Center for Artificial Intelligence; Tim Genewein, Robert Bosch Center for AI; Andreas Nürnberger, Otto-von-Guericke-Universität Magdeburg ; Jan Köhler, Bosch Center for AI\",\r\n1493,Poster,Memory Based Online Learning of Deep Representations from Video Streams,,\"Federico Pernici, MICC University of Florence; federico Bartoli, Micc - University of Florence; Matteo Bruni, Micc - University of Florence; Alberto Del Bimbo, University of Florence\",\r\n1497,Poster,Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks,,\"Wei Xiong, University of Rochester; Wenhan Luo, Tencent AI Lab; Lin Ma, Tencent AI Lab; Wei Liu, ; Jiebo Luo, University of Rochester\",\r\n1501,Poster,Learning Discriminative Evaluation Metrics for Image Captioning,,\"Yin Cui, CornellTech; Guandao Yang, Cornell University; Andreas Veit, Cornel Tech ; Xun Huang, ; Serge Belongie,\",\r\n1508,Poster,Curve Reconstruction via the Global Statistics of Natural Curves,,\"Ehud Barnea, Ben-Gurion University; Ohad Ben-Shahar, Ben-Gurion University\",\r\n1517,Poster,LAMV: Learning to align and match videos with kernelized temporal 
layers,,\"Lorenzo Baraldi, University of Modena; Matthijs Douze, ; Rita Cucchiara, ; Herve Jegou, Facebook AI Research\",\r\n1522,Poster,Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment,,\"Li Ding, MIT; Chenliang Xu, University of Rochester\",\r\n1524,Poster,Matryoshka Networks: Predicting 3D Geometry via Nested Shape Layers,,\"Stephan Richter, TU Darmstadt; Stefan Roth,\",\r\n1531,Poster,Deep Semantic Face Deblurring,,\"Ziyi Shen, Beijing Institute of Technology; Wei-Sheng Lai, University of California, Merced; Tingfa Xu, Beijing Institute of Technology; Jan Kautz, NVIDIA; Ming-Hsuan Yang, UC Merced\",\r\n1539,Poster,Visual to Sound: Generating Natural Sound for Videos in the Wild,,\"Yipin Zhou, UNC-Chapel Hill; Zhaowen Wang, Adobe; Chen Fang, Adobe Research; Trung Bui, ; Tamara Berg, University on North carolina\",\r\n1543,Poster,Deep Reinforcement Learning of Region Proposal Networks for Object Detection,,\"Aleksis Pirinen, Lund University; Cristian Sminchisescu,\",\r\n1549,Poster,When will you do what? 
- Anticipating Temporal Occurrences of Activities,,\"Alexander Richard, University of Bonn; Juergen Gall, University of Bonn, Germany; Yazan Abu Farha, University of Bonn\",\r\n1550,Poster,Pixel-Wise Metric Learning for Blazingly Fast Video Object Segmentation,,\"Yuhua Chen, CVL@ETHZ; Jordi Pont-Tuset, ETHZ; Alberto Montes, ETHZ; Luc Van Gool, KTH\",\r\n1562,Poster,Name-removed-for-review: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection,,\"Tatjana Chavdarova, Idiap and EPFL; Pierre Baqué, EPFL; Andrii Maksai, ; STÉPHANE BOUQUET, EPFL; Cijo Jose, Idiap and EPFL; Louis Lettry, ETH Zürich; Francois Fleuret, Idiap Research Institute; Pascal Fua, ; Luc Van Gool, KTH\",\r\n1564,Poster,DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial Occlusion,,\"Zhishuai Zhang, Johns Hopkins University; Cihang Xie, JHU; Jianyu Wang, ; Lingxi Xie, UCLA; Alan Yuille, JHU\",\r\n1573,Poster,Neighbors Do Help: Deeply Exploiting Local Structures of Point Clouds,,\"Yiru Shen, Clemson University; Chen Feng, MERL; Yaoqing Yang, Carnegie Mellon University; Dong Tian, Mitsubishi Electric Research Laboratories\",\r\n1575,Poster,Controllable Video Generation with Sparse Trajectories,,\"Zekun Hao, ; Xun Huang, ; Serge Belongie,\",\r\n1607,Poster,Direct Shape Regression Networks for End-to-End Face Alignment,,\"Xin Miao, UT Arlington; Xiantong Zhen, Beihang University; Vassilis Athitsos, University of Texas at Arlington; Xianglong Liu, Beihang University; Cheng Deng, Xidian University; Heng Huang, University of Pittsburgh\",\r\n1630,Poster,Joint Pose and Expression Modeling for Facial Expression Recognition,,\"Feifei Zhang, Jiangsu University; Tianzhu Zhang, CASIA; Qirong Mao, Department of Computer Science and Communication Engineering, Jiangsu University; Changsheng Xu,\",\r\n1635,Poster,Beyond Holistic Object Recognition: Enriching Image Understanding with Part States,,\"Cewu Lu, Shanghai Jiao Tong University; hao Su, ; CK Tang, 
HKUST\",\r\n1640,Poster,Recurrent Residual Module for Fast Inference in Videos,,\"Bowen Pan, Shanghai Jiao Tong University; Wuwei Lin, Shanghai Jiao Tong University; Xiaolin Fang, Zhejiang University; Chaoqin Huang, Shanghai Jiaotong University; Bolei Zhou, Massachusetts Institute of Technology; Cewu Lu, Shanghai Jiao Tong University\",\r\n1653,Poster,Geometry-Aware Scene Text Detection with Instance Transformation Network,,\"Fangfang Wang, Zhejiang University; Liming Zhao, Zhejiang University; Xi Li, Zhejiang University; Xinchao Wang, ; Dacheng Tao, University of Sydney\",\r\n1664,Poster,Temporal Hallucinating for Action Recognition with Few Still Images,,\"Lei Zhou, ; Yali Wang, SIAT, CAS; Yu Qiao, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences\",\r\n1672,Poster,Deep Sparse Coding for Invariant Multimodal Halle Berry Neurons,,\"Edward Kim, ; Darryl Hannan, ; Garrett Kenyon,\",\r\n1679,Poster,Fusing Crowd Density Maps and Visual Object Trackers for People Tracking in Crowd Scenes,,\"Weihong Ren, City University of Hong Kong; Di Kang, ; Yandong Tang, Shenyang Institute of Automation, Chinese Academy of Sciences; Antoni Chan, City University of Hong Kong, Hong Kong\",\r\n1696,Poster,Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis,,\"Seunghoon Hong, POSTECH; Dingdong Yang, University of Michigan; Jongwook Choi, University of Michigan; Honglak Lee, University of Michigan, USA\",\r\n1704,Poster,Deep Marching Cubes: Learning Explicit Surface Representations,,\"Yiyi Liao, Zhejiang University; Simon Donné, Ghent University; Andreas Geiger, MPI Tuebingen / ETH Zuerich\",\r\n1716,Poster,A General Two-Step Quantization Approach for Low-bit Neural Networks with High Accuracy,,\"Peisong Wang, CASIA; Qinghao Hu, Chinese Academy of Sciences; Yifan Zhang, CASIA; Jian Cheng, Chinese Academy of Sciences\",\r\n1717,Poster,Clinical Skin Lesion Diagnosis using Representations Inspired by Dermatologist Criteria,,\"Jufeng Yang, Nankai University; 
Xiaoxiao Sun, ; Jie Liang, ; Paul Rosin,\",\r\n1720,Poster,Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition,,\"Jinmian Ye, University of Electronic Science and Technology of China; Linnan Wang, Brown; Guangxi Li, UESTC; Di Chen, ; Shandian Zhe, School of Computing, University of Utah; Zenglin Xu, University of Electronic Science and Technology of China\",\r\n1736,Poster,Deep Progressive Reinforcement Learning for Skeleton-based Action Recognition,,\"Yansong Tang, Tsinghua University; Yi Tian, ; Peiyang Li, ; Jiwen Lu, Tsinghua University; Jie Zhou,\",\r\n1746,Poster,Dimensionality's Blessing: Detecting the distributions underlying images,,\"Wen-Yan Lin, ADSC; Yasuyuki Matsushita, Osaka University; Siying Liu, I2r.a-star.edu.sg; Jianhuang Lai, Sun Yat-sen University\",\r\n1765,Poster,Object Referring in Videos with Language and Human Gaze,,\"Arun Balajee Vasudevan, ETH Zurich; Dengxin Dai, ETH Zurich; Luc Van Gool, KTH\",\r\n1773,Poster,Feature Mapping for Learning Fast and Accurate 3D Pose Inference from Synthetic Images,,\"Mahdi Rad, TUG; Markus Oberweger, ; Vincent Lepetit, TU Graz\",https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/feature-mapping/\r\n1790,Poster,Deep PhaseNet for Video Frame Interpolation,,\"Simone Meyer, ETH Zurich; Abdelaziz Djelouah, The Walt Disney Company; Christopher Schroers, Disney Research Zurich; Brian McWilliams, ; Alexander Sorkine-Hornung, ; Markus Gross,\",\r\n1792,Poster,Non-blind Deblurring: Handling Kernel Uncertainty with CNNs,,\"Subeesh Vasu, IIT Madras; Venkatesh Reddy Maligireddy, IIT Madras; A.N. 
Rajagopalan, IIT Madras\",\r\n1797,Poster,CosFace: Large Margin Cosine Loss for Deep Face Recognition,,\"Hao Wang, ; Yitong Wang, Tencent AI Lab; Zheng Zhou, ; xing Ji, ; Dihong Gong, ; Zhifeng Li, ; Jingchao Zhou, ; Wei Liu,\",\r\n1799,Poster,Lightweight Probabilistic Deep Networks,,\"Jochen Gast, TU Darmstadt; Stefan Roth,\",\r\n1800,Poster,Occlusion-Aware Rolling Shutter Rectification of 3D Scenes,,\"Subeesh Vasu, IIT Madras; Mahesh Mohan M R, IIT Madras; A.N. Rajagopalan, IIT Madras\",\r\n1815,Poster,CRRN: Multi-Scale Guided Concurrent Reflection Removal Network,,\"Renjie Wan, Nanyang Technological Universi; Boxin Shi, Peking University; Ling-Yu Duan, ; Ah-Hwee Tan, ; Alex Kot,\",\r\n1817,Poster,Natural and Effective Obfuscation by Head Inpainting,,\"Qianru Sun, MPI for Informatics; Liqian Ma, KU Leuven; Seong Joon Oh, MPI-INF; Mario Fritz, MPI, Saarbrucken, Germany; Luc Van Gool, KU Leuven; Bernt Schiele, MPI Informatics Germany\",\r\n1838,Poster,What do Deep Networks Like to See?,,\"Sebastian Palacio, DFKI; Joachim Folz, DFKI; Andreas Dengel, DFKI; Jörn Hees, DFKI; Federico Raue, DFKI\",\r\n1860,Poster,Person Re-identification with Cascaded Pairwise Convolutions,,\"Yicheng Wang, ; Zhenzhong Chen, Wuhan University; Feng Wu, ; Gang Wang,\",\r\n1877,Poster,DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Network,,\"Shuang Ma, SUNY Buffalo; Jianlong Fu, ; Chang Chen, ; Tao Mei, Microsoft Research Asia\",\r\n1880,Poster,Deep Cocktail Networks: Multi-source Unsupervised Domain Adaptation with Category Shift,,\"Ruijia Xu, Sun Yat-sen University; Ziliang Chen, Sun Yat-sen University; Wangmeng Zuo, Harbin Institute of Technology; Junjie Yan, ; Liang Lin,\",\r\n1895,Poster,A Hierarchical Generative Model for Eye Image Synthesis and Eye Gaze Estimation,,\"Kang Wang, RPI; Rui Zhao, Rensselaer Polytechnic Institu; Qiang Ji, RPI\",\r\n1904,Poster,Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent 
Networks,,\"Nick Johnston, Google; Damien Vincent, google.com; David Minnen, google.com; Michele Covell, google.com; Saurabh Singh, Univ. of Illinois at Urbana-Champaign; Sung Jin Hwang, google.com; George Toderici, Google; Troy Chinen, google.com; Joel Shor, google.com\",\r\n1914,Poster,3D Pose Estimation and 3D Model Retrieval for Objects in the Wild,,\"Alexander Grabner, Graz University of Technology; Peter Roth, Graz University of Technology; Vincent Lepetit, TU Graz\",\r\n1924,Poster,Motion-Appearance Co-Memory Networks for Video Question Answering,,\"Jiyang Gao, ; Runzhou Ge, Univ. of Southern California; Kan Chen, Univ. of Southern California; Ram Nevatia,\",\r\n1925,Poster,Temporal Deformable Residual Networks for Action Segmentation in Videos,,\"Peng Lei, Oregon State University; Sinisa Todorovic,\",\r\n1931,Poster,Spatially-Adaptive Filter Units for Deep Neural Networks,,\"Domen Tabernik, University of Ljubljana; Matej Kristan, University of Ljubljana; Ales Leonardis, University of Birmingham, UK\",\r\n1939,Poster,Boundary Flow: A Siamese Network that Predicts Boundary Motion without Training on Motion,,\"Peng Lei, Oregon State University; Fuxin Li, Oregon State University; Sinisa Todorovic,\",\r\n1949,Poster,Rolling Shutter and Radial Distortion are Features for High Frame Rate Multi-camera Tracking,,\"Akash Bapat, UNC Chapel Hill; Jan-Michael Frahm, UNC Chapel Hill; True Price, UNC Chapel Hill\",\r\n1960,Poster,Learning Deep Sketch Abstraction,,\"Umar Riaz Muhammad, Queen Mary Uni of London; Yongxin Yang, Queen Mary University of London; Yi-Zhe Song, ; Tao Xiang, Queen Mary University of London; Timothy Hospedales, University of Edinburgh\",\r\n1981,Poster,Learning to See in the Dark,,\"Chen Chen, UIUC; Qifeng Chen, Intel Labs; Jia Xu, Tencent AI Lab; Vladlen Koltun, Intel Labs\",\r\n1990,Poster,Uncalibrated Photometric Stereo under Natural Illumination,,\"Zhipeng Mo, ; Boxin Shi, Peking University; Feng Lu, U. 
Tokyo; Sai-Kit Yeung, ; Yasuyuki Matsushita, Osaka University\",\r\n1991,Poster,A2-RL: Aesthetics Aware Reinforcement Learning for Image Cropping,,\"Debang Li, CASIA; Huikai Wu, CASIA; Junge Zhang, ; Kaiqi Huang,\",\r\n2018,Poster,Very Large-Scale Global SfM by Distributed Motion Averaging,,\"Siyu Zhu, HKUST; Runze Zhang, HKUST; Lei Zhou, HKUST; Tianwei Shen, HKUST; Tian Fang, HKUST; Ping Tan, ; Long Quan, The Hong Kong University of Science and Technology, Hong Kong\",\r\n2021,Poster,ID-GAN: Learning a Symmetry Three-Player GAN for Identity-Preserving Face Synthesis,,\"Yujun Shen, Dept. of IE, CUHK; Ping Luo, The Chinese University of Hong Kong; Junjie Yan, ; Xiaogang Wang, Chinese University of Hong Kong; Xiaoou Tang, Chinese University of Hong Kong\",\r\n2040,Poster,DVQA: Understanding Data Visualization via Question Answering,,\"Kushal Kafle, ; Brian Price, ; Scott Cohen, ; Christopher Kanan, RIT\",\r\n2048,Poster,Globally Optimal Inlier Set Maximization for Atlanta Frame Estimation,,\"Kyungdon Joo, ; Tae-Hyun Oh, MIT; In So Kweon, KAIST; Jean-Charles Bazin, KAIST\",\r\n2064,Poster,End-to-end Convolutional Semantic Embeddings,,\"Quanzeng You, Microsoft; Zhengyou Zhang, Microsoft Research; Jiebo Luo, University of Rochester\",\r\n2097,Poster,MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition,,\"Yizhou Zhou, Univ of Scienc.&Tech. 
of China; Xiaoyan Sun, Microsoft; Zheng-Jun Zha, ; Wenjun Zeng,\",\r\n2102,Poster,Content-Sensitive Supervoxels via Uniform Tessellations on Video Manifolds,,\"Ran Yi, Tsinghua University; Yong-Jin Liu, ; Yu-Kun Lai, Cardiff University\",\r\n2111,Poster,3D Semantic Trajectory Reconstruction from 3D Pixel Continuum,,\"Jae Yoon, ; Ziwei Li, UMN; Hyun Park,\",\r\n2123,Poster,Feature Selective Networks for Object Detection,,\"Yao Zhai, University of Science and Technology of China; Jingjing Fu, ; Yan Lu, ; Houqiang Li,\",\r\n2131,Poster,Semantic Video Segmentation by Gated Recurrent Flow Propagation,,\"David Nilsson, Lund University; Cristian Sminchisescu,\",\r\n2135,Poster,A Constrained Deep Neural Network for Ordinal Regression,,\"Yanzhu Liu, Nanyang Technological Universi; Adams Kong, NTU Singapore ; Chi Keong Goh, Rolls-Royce Advanced Technology Centre\",\r\n2139,Poster,Mean-Variance Loss for Deep Age Estimation from a Face,,\"Hongyu Pan, Institute of Computing Technol; Hu Han, ; Shiguang Shan, Chinese Academy of Sciences; Xilin Chen,\",\r\n2162,Poster,Statistical Tomography of Microscopic Life,,\"Aviad Levis, Technion Institute of Technology; Ronen Talmon, Technion - Israel Institute of Technology; Yoav Schechner, Technion Haifa, Israel\",\r\n2165,Poster,Divide and Conquer for Full-Resolution Light Field Deblurring,,\"Mahesh Mohan M R, IIT Madras; A.N. 
Rajagopalan, IIT Madras\",\r\n2198,Poster,Occluded Pedestrian Detection through Guided Attention in CNNs,,\"Shanshan Zhang, MPI; Jian Yang, Nanjing University of Science and Technology; Bernt Schiele, MPI Informatics Germany\",\r\n2225,Poster,Conditional Image-to-Image Translation,,\"Jianxin Lin, USTC; Yingce Xia, ; Tao Qin, ; Zhibo Chen, ; Tie-Yan Liu,\",\r\n2231,Poster,Human Appearance Transfer,,\"Mihai Zanfir, IMAR and Lund University ; Alin-Ionut Popa, IMAR; Andrei Zanfir, IMAR and Lund University; Cristian Sminchisescu,\",\r\n2235,Poster,Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes,,\"Elisabeta Marinoiu, IMAR and Lund University; Andrei Zanfir, IMAR and Lund University; Cristian Sminchisescu,\",\r\n2236,Poster,Egocentric Basketball Motion Planning from a Single First-Person Image,,\"Gedas Bertasius, University of Pennsylvania; Aaron Chan, U. of Southern California; Jianbo Shi, University of Pennsylvania, USA\",\r\n2237,Poster,SGAN: An Alternative Training of Generative Adversarial Networks,,\"Tatjana Chavdarova, Idiap and EPFL; Francois Fleuret, Idiap Research Institute\",\r\n2240,Poster,3D Human Pose Reconstruction and Action Classification in Robot Assisted Therapy of Children with Autism,,\"Elisabeta Marinoiu, IMAR and Lund University; Mihai Zanfir, IMAR and Lund University ; Vlad Olaru, ; Cristian Sminchisescu,\",\r\n2260,Poster,Single Image Dehazing via Conditional Generative Adversarial Network,,\"Runde Li, NJUST; Jinshan Pan, UC Merced; Zechao Li, Nanjing University of Science and Technology ; Jinhui Tang,\",\r\n2268,Poster,\"Detect globally, refine locally: A novel approach to saliency detection\",,\"TIANTIAN WANG, Dalian University of Technolog; Lihe Zhang, Dalian University of Technology; Huchuan Lu, Dalian University of Technology; Ali Borji, UCF\",\r\n2285,Poster,Improving Landmark Localization with Semi-Supervised Learning,,\"Sina Honari, University of Montreal; Pavlo Molchanov, NVIDIA Research; Jan Kautz, NVIDIA; 
Stephen Tyree, ; Christopher Pal, Ecole Polytechnique de Montreal; Pascal Vincent, University of Montreal\",\r\n2290,Poster,Reward Learning by Instruction,,\"Hsiao-Yu Tung, Carnegie Mellon University; Adam Harley, Carnegie Mellon University; Katerina Fragkiadaki, Carnegie Mellon University\",\r\n2291,Poster,The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks,,\"Maxim Berman, ESAT-PSI, KU Leuven; Amal Rannen Triki, KU Leuven; Matthew Blaschko, KU Leuven\",\r\n2300,Poster,Facial Expression Recognition by De-expression Residue Learning,,\"Huiyuan Yang, Binghamton University-SUNY; Umur Ciftci, Binghamton University-SUNY; Lijun Yin, Binghamton University State University of New York\",\r\n2302,Poster,Learning to Understand Image Blur,,\"Shanghang Zhang, ; Xiaohui Shen, Adobe Research; Zhe Lin, Adobe Systems, Inc.; Radomír Měch, ; João Costeira, ; Jose Moura, Carnegie Mellon University\",\r\n2307,Poster,Hierarchical Novelty Detection for Visual Object Recognition,,\"Kibok Lee, University of Michigan; Kimin Lee, KAIST; Kyle Min, University of Michigan; Yuting Zhang, University of Michigan; Jinwoo Shin, KAIST; Honglak Lee, University of Michigan, USA\",\r\n2313,Poster,Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data,,\"Arghya Pal, Indian Institute of Technology; Vineeth Balasubramanian, IIT Hyderabad\",\r\n2318,Poster,Compare and Contrast: Learning Prominent Visual Differences,,\"Steven Chen, University of Texas at Austin; Kristen Grauman,\",\r\n2368,Poster,Geometry-Guided CNN for Self-supervised Video Representation learning,,\"Chuang Gan, Tsinghua University; Boqing Gong, University of Central Florida; Kun Liu, Beijing University of Posts and Telecommunications; hao Su, ; Leonidas J. 
Guibas,\",\r\n2380,Poster,Multi-Level Fusion based 3D Object Detection from Monocular Images,,\"Bin Xu, ; Zhenzhong Chen, Wuhan University\",\r\n2383,Poster,Explicit Loss-Error-Aware Quantization for Deep Neural Networks,,\"Aojun Zhou, Intel Labs China; Anbang Yao,\",\r\n2387,Poster,Generative Adversarial Perturbations,,\"Omid Poursaeed, Cornell University; Isay Katsman, Cornell University; Bicheng Gao, Shanghai Jiao Tong University; Serge Belongie,\",\r\n2391,Poster,A Hybrid L1-L0 Layer Decomposition Model for Tone Mapping,,\"Zhetong Liang, PolyU; Jun Xu, Hong Kong Polytechnic U; David Zhang, Hong Kong Polytechnic University; Zisheng Cao, ; Lei Zhang, The Hong Kong Polytechnic University\",\r\n2403,Poster,Learning Deep Correspondence through Prior and Posterior Feature Constancy,,\"Zhengfa Liang, NUDT; Yiliu Feng, NUDT; Yulan Guo, NUDT; Hengzhu Liu, NUDT; Wei Chen, ; Linbo Qiao, ; Li Zhou, NUDT; Jianfeng Zhang, NUDT\",\r\n2411,Poster,End-to-end learning of keypoint detector and descriptor for pose invariant 3D matching,,\"Georgios Georgakis, George Mason University; Srikrishna Karanam, Siemens Corporate Technology; Ziyan Wu, Siemens Corporation; Jan Ernst, Siemens Corporation; Jana Kosecka, George Mason University\",\r\n2427,Poster,Matching Adversarial Networks,,\"Gellert Mattyus, UBER ATG; Raquel Urtasun, University of Toronto\",\r\n2434,Poster,Stochastic Variational Inference with Gradient Linearization,,\"Tobias Plötz, TU Darmstadt; Anne Wannenwetsch, TU Darmstadt; Stefan Roth,\",\r\n2437,Poster,Geometry-aware Deep Network for Single-Image Novel View Synthesis,,\"Miaomiao Liu, Data61,CSIRO; Xuming He, ShanghaiTech; Mathieu Salzmann, EPFL\",\r\n2443,Poster,Robust Depth Estimation from Auto Bracketed Images,,\"Sunghoon Im, KAIST; Hae-Gon Jeon, KAIST; In So Kweon, KAIST\",\r\n2447,Poster,Document Enhancement using Visibility Detection,,\"Nati Kligler, Technion; Sagi Katz, Technion; Ayellet Tal, Technion\",\r\n2450,Poster,Co-Occurrence Template Matching,,\"Shai 
Avidan, ; rotal kat, Tel-Aviv University; roy jevnisek, Tel-Aviv University\",\r\n2451,Poster,Intrinsic Image Transformation via Scale Space Decomposition,,\"Lechao Cheng, ; Chengyi Zhang, Zhejiang University; Zicheng Liao,\",\r\n2455,Poster,Depth and Transient Imaging with Compressive SPAD Array Cameras,,\"Qilin Sun, KAUST; Xiong Dun, KAUST; Yifan (Evan) Peng, UBC; Wolfgang Heidrich,\",\r\n2457,Poster,Efficient and Deep Person Re-Identification using Multi-Level Similarity,,\"Yiluan Guo, SUTD; Ngai-Man Cheung,\",\r\n2463,Poster,SoS-RSC: A Sum-of-Squares Polynomial Approach to Robustifying Subspace Clustering Algorithms,,\"Octavia Camps, Northeastern University, USA; Mario Sznaier,\",\r\n2484,Poster,Arbitrary Style Transfer with Deep Feature Reshuffle,,\"Shuyang Gu, USTC; Congliang Chen, Peking University; Jing Liao, ; Lu Yuan, Microsoft Research Asia\",\r\n2486,Poster,Multi-Label Zero-Shot Learning with Structured Knowledge Graphs,,\"Chung-Wei Lee, National Taiwan University; Wei Fang, National Taiwan University; Chih-Kuan Yeh, Carnegie Mellon University; Yu-Chiang Frank Wang, Academia Sinica\",https://people.csail.mit.edu/weifang/project/vll18-mlzsl/\r\n2499,Poster,Mobile Video Object Detection with Temporally-Aware Feature Maps,,\"Menglong Zhu, ; Mason Liu, Georgia Tech\",\r\n2501,Poster,Free supervision from video games,,\"Philipp Krahenbuhl,\",\r\n2508,Poster,\"Fast, Simple, and Effective Resource-Constrained Structure Learning of Deep Networks\",,\"Ariel Gordon, Google; Elad Eban, Google; Bo Chen, Google; ofir Nachum, Google; Tien-Ju Yang, Massachusetts Institute of Technology; Edward Choi, Georgia Institute of Technology\",\r\n2518,Poster,Learning and Using the Arrow of Time,,\"Donglai Wei, MIT; Andrew Zisserman, Oxford; William Freeman, MIT/Google; Joseph Lim, University of Southern California\",\r\n2519,Poster,Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition,,\"Shizhong Han, 1986; zibo Meng, ; Zhiyuan Li, 
University of South Carolina; JAMES O'REILLY, University of South Carolina; Jie Cai, University of South Carolina; Xiaofeng Wang, University of South Carolina; Yan Tong, University of South Carolina\",\r\n2529,Poster,Gaze Prediction in Dynamic $360^\\circ$ Immersive Videos,,\"Yanyu Xu, Shanghaitech University; Yanbing Dong, ; Junru Wu, ; Zhengzhong Sun, ; Zhiru Shi, ; Jingyi Yu, ; Shenghua Gao, ShanghaiTech University\",\r\n2542,Poster,Modulated Convolutional Networks,,\"Xiaodi Wang, Beihang University; Baochang Zhang, ; Ce Li, CUMTB; Rongrong Ji, ; jungong han, ; Xianbin Cao, Beihang University; jianzhuang liu,\",\r\n2561,Poster,SeedNet : Automatic Seed Generation with Deep Reinforcement Learning for Robust Interactive Segmentation,,\"Gwangmo Song, Seoul National University; Heesoo Myeong, Samsung; Kyoung Mu Lee,\",\r\n2578,Poster,Preserving Semantic Relations for Zero-Shot Learning,,\"Yashas Annadani, NITK; Soma Biswas, Indian Institute of Science\",\r\n2582,Poster,What have we learned from deep representations for action recognition?,,\"Christoph Feichtenhofer, ; Axel Pinz, Graz University of Technology; Richard Wildes, York University; Andrew Zisserman, Oxford\",\r\n2588,Poster,Neural Style Transfer via Meta Networks,,\"Falong Shen, Peking University; Shuicheng Yan, ; Gang Zeng, Peking University\",\r\n2599,Poster,InverseFaceNet: Deep Monocular Inverse Face Rendering at over 250 Hz,,\"Hyeongwoo Kim, MPII; Michael Zollhöfer, MPI Informatics; Ayush Tewari, MPI Informatics; Justus Thies, Technical University of Munich; Christian Richardt, University of Bath; Christian Theobalt, MPI Informatics\",\r\n2601,Poster,\"People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting\",,\"Mark Marsden, Dublin City University; Kevin McGuinness, DCU ; Suzanne Little, DCU; Ciara Keogh, University College Dublin, Ireland; Noel O'Connor, DCU\",\r\n2602,Poster,Multi-Frame Quality Enhancement for Compressed Video,,\"Ren 
Yang, Beihang University; Mai Xu, Beihang University; Zulin Wang, Beihang University; Tianyi Li, Beihang University\",\r\n2604,Poster,DiverseNet: When One Right Answer Is Not Enough,,\"Michael Firman, UCL; Neill Campbell, University of Bath; Lourdes Agapito, University College London; Gabriel Brostow, University College London UK\",\r\n2605,Poster,Beyond the Pixel-Wise Loss for Topology-Aware Delineation,,\"Agata Mosinska, EPFL; Pablo Marquez Neila, EPFL; Mateusz Kozinski, ; Pascal Fua,\",\r\n2618,Poster,A Perceptual Measure for Deep Single Image Camera Calibration,,\"Yannick Hold-Geoffroy, Université Laval; Kalyan Sunkavalli, Adobe Systems Inc.; Jonathan Eisenmann, Adobe Systems; Matthew Fisher, Adobe; Emiliano Gambaretto, Adobe Systems; Sunil Hadap, ; Jean-Francois Lalonde, Laval University\",\r\n2619,Poster,Show Me a Story: Towards Coherent Neural Story Illustration,,\"Hareesh Ravi, Rutgers University; Lezi Wang, Rutgers; Carlos Muniz, Rutgers University; Leonid Sigal, University of British Columbia; Mubbasir Kapadia, Rutgers University\",\r\n2624,Poster,Towards Universal Representation for Unseen Action Recognition,,\"Yi Zhu, University of California Merced; Yang Long, Newcastle University; Yu Guan, Newcastle University; Shawn Newsam, ; Ling Shao, University of East Anglia\",\r\n2628,Poster,A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects,,\"Yuanlu Xu, University of California, Los Angeles; Lei Qin, Institute of Computing Technology, Chinese Academy of Sciences; Xiaobai Liu, San Diego State University; Song-Chun Zhu,\",\r\n2633,Poster,Sparse Photometric 3D Face Reconstruction Guided by Morphable Models,,\"Xuan Cao, ShanghaiTech University; Zhang Chen, ShanghaiTech University; jingyi Yu, Shanghai Tech University; Anpei Chen,\",\r\n2635,Poster,Deep Adversarial Subspace Clustering,,\"Pan Zhou, National university of singapo; Yunqing Hou, NUS; Jiashi Feng,\",\r\n2644,Poster,KIPPI: KInetic Polygonal Partitioning of 
Images,,\"Jean-Philippe Bauchet, Inria; Florent Lafarge,\",\r\n2647,Poster,Planar Shape Detection at Structural Scales,,\"Hao Fang, Inria; Florent Lafarge, ; Mathieu Desbrun, Caltech\",\r\n2662,Poster,Reconstruction Network for Video Captioning,,\"Bairui Wang, ; Lin Ma, Tencent AI Lab; Wei Zhang, ; Wei Liu,\",\r\n2695,Poster,Tight Nonconvex Relaxation of MAP Inference,,\"D. Khuê Lê-Huu, Inria & CentraleSupélec, Université Paris-Saclay; Nikos Paragios, Ecole Centrale de Paris\",\r\n2697,Poster,Weakly Supervised Phrase Localization with Multi-Scale Anchored Transformer Network,,\"Fang Zhao, National University of Singapore; Jianshu Li, National University of Singapo; Jian Zhao, NUS; Jiashi Feng,\",\r\n2701,Poster,Variational Autoencoders for Deforming 3D Mesh Models,,\"Qingyang Tan, UCAS; Lin Gao, Chinese Academy of Sciences; Yu-Kun Lai, Cardiff University; Shihong Xia, Institute of Computing Technology, CAS, Beijing, China\",\r\n2703,Poster,DeepMVS: Learning Multi-View Stereopsis,,\"Po-Han Huang, University of Illinois, U-C; Kevin Matzen, Facebook; Johannes Kopf, Facebook; Narendra Ahuja, University of Illinois at Urbana-Champaign, USA; Jia-Bin Huang, Virginia Tech\",\r\n2705,Poster,HydraNets: Specialized Dynamic Architectures for Efficient Inference,,\"Ravi Teja Mullapudi, Carnegie Mellon University; Noam Shazeer, Google; William Mark, Google; Kayvon Fatahalian, Stanford\",\r\n2715,Poster,Pix3D: Dataset and Methods for 3D Object Modeling from a Single Image,,\"Xingyuan Sun, Shanghai Jiao Tong University; Jiajun Wu, MIT; Xiuming Zhang, MIT; Zhoutong Zhang, MIT; Tianfan Xue, Google; Joshua Tenenbaum, ; William Freeman, MIT/Google\",\r\n2722,Poster,Defense against Universal Adversarial Perturbations,,\"NAVEED AKHTAR, UNIVERSITY OF WESTERN AUSTRALI; Jian Liu, UWA; Ajmal Mian, UWA\",\r\n2735,Poster,DecideNet: Counting Varying Density Crowds Through Attention Guided Detection and Density Estimation,,\"Jiang Liu, Carnegie Mellon University; Chenqiang Gao, Chongqing University 
of Posts and Telecommunications; Deyu Meng, Xi'an Jiaotong University; Alexander Hauptmann,\",\r\n2738,Poster,Deeply Learned Filter Response Functions for Hyperspectral Reconstruction,,\"Shijie Nie, NII, Japan; Lin Gu, National Institute of Informatics; Yinqiang Zheng, National Institute of Informatics, Japan; Antony Lam, Saitama University; Nobutaka Ono, Tokyo Metropolitan University; Imari Sato, National Institute of Informatics, Japan\",\r\n2742,Poster,Learning Strict Identity Mappings in Deep Residual Networks,,\"Xin Yu, University of Utah; Srikumar Ramalingam, ; Zhiding Yu, Carnegie Mellon University\",\r\n2744,Poster,Face Detector Adaptation without Negative Transfer or Catastrophic Forgetting,,\"Muhammad Abdullah Jamal, University of Central Florida; Haoxiang Li, Adobe Research; Boqing Gong, University of Central Florida\",\r\n2752,Poster,\"Multi-Evidence Fusion and Filtering for Weakly Supervised Object Recognition, Detection and Segmentation\",,\"Weifeng Ge, The University of Hong Kong; Yizhou Yu, The University of Hong Kong\",\r\n2764,Poster,Dynamic Graph Generation Network: Generating Relational Knowledge from Diagrams,,\"Daesik Kim, Seoul National University; YoungJoon Yoo, ; JeeSoo Kim, Seoul national university; SangKuk Lee, Seoul National University; Nojun Kwak, Seoul National University\",\r\n2767,Poster,Jerk-Aware Video Acceleration Magnification,,\"Shoichiro Takeda, NTT Media Intelligence Lab.; Kazuki Okami, NTT Media Intelligence Lab.; Dan Mikami, NTT Media Intelligence Lab.; Megumi Isogai, NTT Media Intelligence Lab.; Hideaki Kimata, NTT Media Intelligence Lab.\",\r\n2779,Poster,Mining on Manifolds: Metric Learning without Labels,,\"Ahmet Iscen, Inria; Giorgos Tolias, Czech Technical University in Prague; Yannis Avrithis, Inria; Ondrej Chum, Czech Technical University in Prague\",\r\n2788,Poster,Referring Image Segmentation via Recurrent Refinement Networks,,\"Ruiyu Li, CUHK; Kaican Li, CUHK; Yi-Chun Kuo, CUHK; Michelle Shu, ; Xiaojuan Qi, CUHK; 
Xiaoyong Shen, CUHK; Jiaya Jia, Chinese University of Hong Kong\",\r\n2797,Poster,Disentangling Factors of Variation by Mixing Them,,\"Qiyang HU, University of Bern; Attila Szabo, University of Bern; Tiziano Portenier, ; Matthias Zwicker, ; Paolo Favaro, Bern University, Switzerland\",\r\n2798,Poster,LSTM Pose Machines,,\"Yue Luo, SenseTime; Jimmy Ren, SenseTime Group Limited; Zhouxia Wang, SenseTime; Wenxiu Sun, SenseTime Group Limited; Jinshan Pan, UC Merced; Jianbo Liu, SenseTime; Jiahao Pang, SenseTime Group Limited; Liang Lin,\",\r\n2799,Poster,CNN based Learning using Reflection and Retinex Models for Intrinsic Image Decomposition,,\"Anil Baslamisli, University of Amsterdam; Hoang-An Le, University of Amsterdam; Theo Gevers, University of Amsterdam\",\r\n2821,Poster,Translating and Segmenting Multimodal Medical Volumes with Cycle- and Shape-Consistency Generative Adversarial Network,,\"Zizhao Zhang, University of Florida; Lin Yang, ; Yefeng Zheng, Siemens\",\r\n2828,Poster,Human Semantic Parsing for Person Re-identification,,\"Mahdi Kalayeh, UCF; Emrah Basaran, ; Mubarak Shah, UCF\",\r\n2834,Poster,Learned Shape-Tailored Descriptors for Segmentation,,\"Naeemullah Khan, KAUST; Ganesh Sundaramoorthi,\",\r\n2835,Poster,ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans,,\"Angela Dai, ; Daniel Ritchie, Brown University; Martin Bokeloh, Google; Scott Reed, Google; Juergen Sturm, Google; Matthias Nießner, Technical University of Munich\",\r\n2839,Poster,Learning Compressible 360 Video Isomers,,\"Yu-Chuan Su, UT Austin; Kristen Grauman,\",\r\n2844,Poster,Geometric robustness of deep networks: analysis and improvement,,\"Can Kanbak, EPFL; Seyed-Mohsen Moosavi-Dezfooli, ; Pascal Frossard,\",\r\n2848,Poster,Weakly Supervised Facial Action Unit Recognition through Adversarial Training,,\"Guozhu Peng, USTC; Shangfei Wang,\",\r\n2852,Poster,Disentangling Features in 3D Face Shapes for Joint Face Reconstruction and Recognition,,\"Feng Liu, 
Sichuan University; Dan Zeng, Sichuan University; Qijun Zhao, Sichuan University; Xiaoming Liu, Michigan State University\",\r\n2859,Poster,View Extrapolation of Human Body from a Single Image,,\"Hao Zhu, Nanjing University; hao Su, ; Peng Wang, Baidu; Xun Cao, EE Department, Nanjing Univ; Ruigang Yang, University of Kentucky\",\r\n2864,Poster,Adversarially Occluded Samples for Person Re-identification,,\"Houjing Huang, CASIA; Dangwei Li, ; Zhang Zhang, ; Xiaotang Chen, ; Kaiqi Huang,\",\r\n2873,Poster,Single-Image Depth Estimation Based on Fourier Domain Analysis,,\"Jaehan Lee, Korea University; Minhyeok Heo, Korea University; Kyung-Rae Kim, Korea University; Chang-Su Kim,\",\r\n2896,Poster,Classifier Learning with Prior Probabilities for Facial Action Unit Recognition,,\"Yong Zhang, CASIA; Weiming Dong, ; Bao-Gang Hu, CASIA; Qiang Ji, RPI\",\r\n2914,Poster,Image Restoration by Estimating Frequency Distribution of Local Patches,,\"Jaeyoung Yoo, Seoul National University; Sang ho Lee, Seoul National University; Nojun Kwak, Seoul National University\",\r\n2932,Poster,Domain Generalization with Adversarial Feature Learning,,\"Haoliang Li, Nanyang Technological Universi; Sinno Jialin Pan, Nanyang Technological University, Singapore; Shiqi Wang, City University of Hong Kong; Alex Kot,\",\r\n2936,Poster,Camera Pose Estimation with Unknown Principal Point,,\"Viktor Larsson, Lund University; Zuzana Kukelova, Czech Technical University in Prague; Yinqiang Zheng, National Institute of Informatics, Japan\",\r\n2954,Poster,Image Blind Denoising With Generative Adversarial Network Based Noise Modeling,,\"Jingwen Chen, Sun Yat-sen University; Jiawei Chen, Sun Yat-sen University; Hongyang Chao, Sun Yat-sen University; Ming Yang,\",\r\n2967,Poster,Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View,,\"Albert Pumarola, IRI (CSIC-UPC); Antonio Agudo, IRI (CSIC-UPC); Lorenzo Porzi, Mapillary Research; Alberto Sanfeliu, IRI (CSIC-UPC); Vincent Lepetit, University 
of Bordeaux; Francesc Moreno-Noguer, Institut de Robotica i Informatica Industrial (UPC/CSIC)\",\r\n2970,Poster,Event-based Vision meets Deep Learning on Steering Prediction for Self-driving Cars,,\"Antonio Loquercio, University of Zurich; Ana Maqueda, Universidad Politecnica de Madrid; Guillermo Gallego, University of Zurich; Narciso Garcia, Universidad Politecnica de Madrid; Davide Scaramuzza, University of Zurich\",\r\n2979,Poster,PoTion: Pose MoTion Representation for Action Recognition,,\"Vasileios Choutas, Naver Labs Europe; Philippe Weinzaepfel, Xerox; Jerome Revaud, Naver Labs Europe; Cordelia Schmid, INRIA Grenoble, France\",\r\n2986,Poster,Deep Lesion Graph in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database,,\"Ke Yan, National Institute of Health; Xiaosong Wang, NIH; Le Lu, Nvidia Corp; Ling Zhang, NIH; Adam Harrison, National Institutes of Health; MOHAMMADHADI Bagheri, NIH; Ronald Summers,\",\r\n2988,Poster,Inverse Composition Discriminative Optimization for Point Cloud Registration,,\"Jayakorn Vongkulbhisal, Carnegie Mellon University; Beñat Irastorza Ugalde, ; Fernando de la Torre, ; João Costeira,\",\r\n2992,Poster,Dense Decoder Shortcut Connections for Single-Pass Semantic Segmentation,,\"Piotr Bilinski, University of Oxford; Victor Prisacariu, Oxford\",\r\n2998,Poster,Logo Synthesis and Manipulation with Clustered Generative Adversarial Networks,,\"Alexander Sage, ETH Zurich; Eirikur Agustsson, ETH Zurich; Radu Timofte, ETH Zurich; Luc Van Gool, KTH\",\r\n3010,Poster,Learning Answer Embeddings for Visual Question Answering,,\"Hexiang Hu, ; Wei-Lun Chao, USC; Fei Sha, University of Southern California\",\r\n3026,Poster,Sim2Real View Invariant Visual Servoing by Recurrent Control,,\"Fereshteh Sadeghi, University of Washington; Alexander Toshev, Google; Sergey Levine, UC Berkeley\",\r\n3027,Poster,Spanning Patches: Deep Patch Selection for Fast Multi-View Stereo,,\"Alex 
Poms, Carnegie Mellon University; Shoou-I Yu, Oculus; Chenglei Wu, Oculus; Yaser Sheikh,\",\r\n3028,Poster,Efficient Large-scale Approximate Nearest Neighbor Search on OpenCL FPGA,,\"Jialiang Zhang, University of Wisconsin-Madiso; Soroosh Khoram, UW-Madison; Jing Li, University of Wisconsin-Madison\",\r\n3035,Poster,Learning distributions of shape trajectories from longitudinal datasets: a hierarchical model on a manifold of diffeomorphisms,,\"Alexandre B_e, Brain and Spine Institute; Olivier Colliot, Institut du Cerveau et de la Moelle _ini_e; Stanley Durrleman, Institut du Cerveau et de la Moelle _ini_e\",\r\n3057,Poster,Geometry Aware Optimization for Deep Learning: The Good Practice,,\"SOUMAVA KUMAR ROY, AUSTRALIAN NATIONAL UNIVERSITY; Zakaria Mhammedi, Data61, CSIRO; Mehrtash Harandi, Australian National University\",\r\n3063,Poster,Good View Hunting: Learning Photo Composition from 1 Million View Pairs,,\"Zijun Wei, Stony Brook University; Jianming Zhang, Adobe Research; Minh Hoai, Stony Brook University; Xiaohui Shen, Adobe Research; Zhe Lin, Adobe Systems, Inc.; Radom_ Mech, ; Dimitris Samaras,\",\r\n3073,Poster,Analyzing Filters Toward Efficient ConvNet,,\"Takumi Kobayashi,\",\r\n3075,Poster,Bilateral Ordinal Relevance Multi-instance Regression for Facial Action Unit Intensity Estimation,,\"Yong Zhang, CASIA; Rui Zhao, Rensselaer Polytechnic Institu; Weiming Dong, ; Bao-Gang Hu, CASIA; Qiang Ji, RPI\",\r\n3102,Poster,Interpretable Video Captioning via Trajectory Structured Localization,,\"Xian Wu, Sysu; Guanbin Li, ; Liang Lin,\",\r\n3107,Poster,Joint Optimization Framework for Learning with Noisy Labels,,\"Daiki Tanaka, The University of Tokyo; Daiki Ikami, The University of Tokyo; Toshihiko Yamasaki, The University of Tokyo; Kiyoharu Aizawa,\",\r\n3112,Poster,Learning a Complete Image Indexing Pipeline,,\"Himalaya Jain, Inria, Technicolor; Joaquin Zepeda, ; Patrick Perez, Technicolor Research; R_i Gribonval, Inria\",\r\n3118,Poster,Inferring Co-Attention 
in Social Scene Videos,,\"Lifeng Fan, VCLA@UCLA; Yixin Chen, VCLA@UCLA; Ping Wei, Xi'an Jiaotong University; Song-Chun Zhu,\",\r\n3120,Poster,A Network Architecture for Point Cloud Classification via Automatic Depth Images Generation,,\"Lukas Rahmann, ; Riccardo Roveri, ETH Zurich; Cengiz Oztireli, ; Markus Gross,\",\r\n3123,Poster,Blind Predicting Similar Quality Map for Image Quality Assessment,,\"Da Pan, Communication University of CN; Ping Shi, ; Ming Hou, ; Zefeng Ying, ; Sizhe Fu, ; Yuan Zhang,\",\r\n3141,Poster,Image Correction via Deep Reciprocating HDR Transformation,,\"Xin Yang, Dalian University of Technology, City University of Hong Kong; Ke Xu, Dalian University of Technology; City University of Hong Kong; Yibing Song, Tencent AI Lab; Qiang Zhang, Dalian University of Technology; Xiaopeng Wei, Dalian University of Technology; Rynson Lau, City University of Hong Kong\",\r\n3145,Poster,Towards Human-Machine Cooperation: Evolving Active Learning with Self-supervised Process for Object Detection,,\"Keze Wang, ; Liang Lin, ; Xiaopeng Yan, Sun Yat-sen University; Lei Zhang, The Hong Kong Polytechnic University\",\r\n3170,Poster,PoseFlow: A Deep Motion Representation for Understanding Human Behaviors in Videos,,\"Dingwen Zhang, ; Guangyu Guo, ; Dong Huang, Carnegie Mellon University; Fernando de la Torre, ; Junwei Han, Northwestern Polytechnical U.\",\r\n3208,Poster,Deep Density Clustering of Unconstrained Faces,,\"Wei-An Lin, UMD; Jun-Cheng Chen, ; Carlos Castillo, ; Rama Chellappa, University of Maryland, USA\",\r\n3214,Poster,Learning Steerable Filters for Rotation Equivariant CNNs,,\"Maurice Weiler, Heidelberg University; Fred Hamprecht, Heidelberg University, Germany; Martin Storath,\",\r\n3219,Poster,Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection,,\"Hao Chen, City University of Hong Kong; You fu Li, City University of Hong Kong\",\r\n3220,Poster,Regularizing Deep Networks by Modeling and Predicting Label 
Structure,,\"Mohammadreza Mostajabi, TTI-Chicago; Michael Maire, ; Greg Shakhnarovich,\",\r\n3236,Poster,Probabilistic Joint Face-Skull Modelling for Facial Reconstruction,,\"Dennis Madsen, University of Basel; Marcel Lüthi, ; Andreas Schneider, ; Thomas Vetter, U. Basel\",\r\n3241,Poster,Making Convolutional Networks Recurrent for Visual Sequence Learning,,\"Xiaodong Yang, NVIDIA; Pavlo Molchanov, NVIDIA Research; Jan Kautz, NVIDIA\",\r\n3242,Poster,Zero-Shot Kernel Learning,,\"Hongguang Zhang, Data61; Piotr Koniusz, Data61/CSIRO\",\r\n3244,Poster,Deeper Look at Power Normalizations,,\"Piotr Koniusz, Data61/CSIRO; Hongguang Zhang, Data61; Fatih Porikli, NICTA, Australia\",\r\n3249,Poster,On the Duality Between Retinex and Image Dehazing,,\"Adrian Galdran, INESC TEC Porto; Aitor Alvarez-Gila, Tecnalia / CVC-Universitat Autonoma de Barcelona; Alessandro Bria, University of Cassino and L.M.; Javier Vazquez-Corral, Universitat Pompeu Fabra; Marcelo Bertalmio,\",\r\n3259,Poster,Pulling Actions out of Context: Explicit Separation for Effective Combination,,\"Yang Wang, Stony Brook University; Minh Hoai, Stony Brook University\",\r\n3269,Poster,Multi-Scale Weighted Nuclear Norm Image Restoration,,\"Noam Yair, Technion; Tomer Michaeli, Technion\",\r\n3272,Poster,Finding beans in burgers: Deep semantic-visual embedding with localization,,\"Patrick Perez, Technicolor Research; Matthieu Cord, ; Louis Chevallier, technicolor; Martin Engilberge, technicolor\",\r\n3279,Poster,Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning,,\"David Mascharka, MIT Lincoln Laboratory; Philip Tran, Planck Aerosystems; Ryan Soklaski, MIT Lincoln Laboratory; Arjun Majumdar, MIT Lincoln Laboratory\",\r\n3285,Poster,Occlusion Aware Unsupervised Learning of Optical Flow,,\"Yang Wang, Baidu USA; Yi Yang, ; Zhenheng Yang, ; Liang Zhao, Baidu USA; Wei Xu,\",\r\n3286,Poster,PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous 
Depth Estimation and Scene Parsing,,\"Dan Xu, ; Wanli Ouyang, The University of Sydney; Xiaogang Wang, Chinese University of Hong Kong; Nicu Sebe, University of Trento\",\r\n3295,Poster,Fooling Vision and Language Models Despite Localization and Attention Mechanism,,\"Xiaojun Xu, Shanghai Jiao Tong University; Xinyun Chen, UC Berkeley; Chang Liu, UC Berkeley; Anna Rohrbach, UC Berkeley; Trevor Darrell, UC Berkeley, USA; Dawn Song, UC Berkeley\",\r\n3299,Poster,4DFAB: A Large Scale 4D Database for Facial Expression Analysis and Biometric Applications,,\"Shiyang Cheng, Imperial College London; Irene Kotsia, Middlesex University London; Maja Pantic, Imperial College London, UK; Stefanos Zafeiriou, Imperial College London\",\r\n3303,Poster,An Efficient and Provable Approach for Mixture Proportion Estimation Using Linear Independence Assumption,,\"Xiyu Yu, The University of Sydney; Tongliang Liu, The University of Sydney; Mingming Gong, ; Kayhan Batmanghelich, University of Pittsburgh; Dacheng Tao, University of Sydney\",\r\n3311,Poster,Nonlocal Low-Rank Tensor Factor Analysis for Image Restoration,,\"Xinyuan Zhang, Duke University; Xin Yuan, Nokia Bell Labs; Lawrence Carin,\",\r\n3315,Poster,Hierarchical Recurrent Attention Networks for Structured Online Maps,,\"Namdar Homayounfar, Uber ATG; Wei-Chiu Ma, MIT; Shrinidhi Kowshika Lakshmikanth, Uber ATG; Raquel Urtasun, University of Toronto\",\r\n3327,Poster,Geometric Multi-Model Fitting with a Convex Relaxation Algorithm,,\"Paul Amayo, Oxford ; Pedro Pinies, University of Oxford; Lina Paz, University of Oxford; Paul Newman, University of Oxford\",\r\n3335,Poster,Image to Image Translation for Domain Adaptation,,\"Zak Murez, UCSD; Soheil Kolouri, HRL Laboratories, LLC; David Kriegman, University of California at San Diego; Ravi Ramamoorthi, University of California, San Diego; Kyungnam Kim, HRL Laboratories\",\r\n3345,Poster,CNN Driven Sparse Multi-Level B-spline Image Registration,,\"Pingge Jiang, Drexel University; 
James Shackleford, Drexel University\",\r\n3349,Poster,DocUNet: Document Image Unwarping via A Stacked U-Net,,\"Ke Ma, Stony Brook University; Zhixin Shu, Stony Brook University; Xue Bai, Megvii Inc; Jue Wang, Megvii; Dimitris Samaras,\",\r\n3350,Poster,Texture Mapping for 3D Reconstruction with RGB-D Sensor,,\"Yanping Fu, Wuhan University; Qingan Yan, JD.com; Long Yang, Northwest A&F University; Jie Liao, Wuhan University; Chunxia Xiao, Wuhan University\",\r\n3352,Poster,Sliced Wasserstein Distance for Learning Gaussian Mixture Models,,\"Soheil Kolouri, HRL Laboratories, LLC; Gustavo Rohde, University of Virginia; Heiko Hoffmann, HRL Laboratories, LLC\",\r\n3373,Poster,Composing Two Objects of Interest for Flying Camera Photography,,\"ZIQUAN LAN, NUS; David Hsu, NUS; Gim Hee Lee, National University of Singapore\",\r\n3375,Poster,Time-resolved Light Transport Decomposition for Thermal Photometric Stereo,,\"Nobuhiro Ikeya, NAIST; Kenichiro Tanaka, NAIST; Tsuyoshi Takatani, NAIST; Hiroyuki Kubo, ; Takuya Funatomi, NAIST; Yasuhiro Mukaigawa, NAIST\",\r\n3383,Poster,Aligning Infinite-Dimensional Covariance Matrices in Reproducing Kernel Hilbert Spaces for Domain Adaptation,,\"Zhen Zhang, WASHINGTON UNIVERSITY IN ST.LO; Mianzhi Wang, WASHINGTON UNIVERSITY IN ST.LOUIS; Yan Huang, ; Arye Nehorai, WASHINGTON UNIVERSITY IN ST.LOUIS\",\r\n3408,Poster,Feature Super-Resolution: Make Machine See More Clearly,,\"Weimin Tan, Fudan University; Bo Yan, Fudan University; Bahetiyaer Bare, Fudan University\",\r\n3412,Poster,CLEAR: Cumulative LEARning for One-Shot One-Class Image Recognition,,\"Jedrzej Kozerawski, UCSB; Matthew Turk, UC Santa Barbara USA\",\r\n3427,Poster,\"Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation\",,\"Mark Sandler, Google; Andrew Howard, Google; Menglong Zhu, ; Andrey Zhmoginov, Google; Liang-Chieh Chen,\",\r\n3435,Poster,CleanNet: Transfer Learning for Scalable Image Classifier Training with Label 
Noise,,\"Kuang-Huei Lee, Microsoft; Xiaodong He, ; Lei Zhang, Microsoft; Linjun Yang, Facebook\",\r\n3442,Poster,MoNet: Moments Embedding Network,,\"Mengran Gou, Northeastern University; Fei Xiong, University of Southern California ; Octavia Camps, Northeastern University, USA; Mario Sznaier,\",\r\n3446,Poster,Self-calibrating polarising radiometric calibration,,\"Daniel Teo, SUTD; Boxin Shi, Peking University; Yinqiang Zheng, National Institute of Informatics, Japan; Sai-Kit Yeung,\",\r\n3448,Poster,Representing and Learning High Dimensional Data with the Optimal Transport Map from a Probabilistic Viewpoint,,\"Serim Park, Oath; Matthew Thorpe,\",\r\n3460,Poster,ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information,,\"Rodney LaLonde, University of Central Florida; Dong Zhang, University of Central Florida; Mubarak Shah, UCF\",\r\n3468,Poster,Seeing Small Faces from Robust Anchor's Perspective,,\"Chenchen Zhu, Carnegie Mellon University; Ran Tao, Carnegie Mellon University; Khoa Luu, ; Marios Savvides,\",\r\n3470,Poster,Gesture Recognition: Focus on the Hands,,\"Pradyumna Narayana, Colorado State University; Ross Beveridge, Colorado State University; Bruce Draper, Colorado State University\",\r\n3483,Poster,PieAPP: Perceptual Image-Error Assessment through Pairwise Preference,,\"Ekta Prashnani, UCSB; Hong Cai, University of California, Santa Barbara; Yasamin Mostofi, UCSB; Pradeep Sen, University of California, Santa Barbara\",\r\n3494,Poster,FFNet: Video Fast-Forwarding via Reinforcement Learning,,\"Shuyue Lan, Northwestern University; Rameswar Panda, UC Riverside; Qi Zhu, UC Riverside; Amit Roy-Chowdhury, UC Riverside\",\r\n3495,Poster,Two can play this Game: Visual Dialog with Discriminative Visual Question Generation and Visual Question Answering,,\"Unnat Jain, UIUC; Lana Lazebnik, ; Alex Schwing,\",\r\n3502,Poster,A Prior-Less Method for Multi-Face Tracking in Unconstrained Videos,,\"CHUNG-CHING LIN, IBM Research; Ying 
Hung, Rutgers University\",\r\n3508,Poster,Analytical Modeling of Vanishing Points and Curves in Catadioptric Cameras,,\"Pedro Miraldo, Instituto Superior Técnico, Lisboa; Francisco Girbal Eiras, University of Oxford; Srikumar Ramalingam,\",\r\n3531,Poster,Coding Kendall's Shape Trajectories for 3D Action Recognition,,\"Amor Ben Tanfous, IMT Lille Douai; Hassen Drira, IMT Lille Douai; Boulbaba Ben Amor, IMT Lille Douai\",\r\n3552,Poster,Fast and Robust Estimation for Unit-Norm Constrained Linear Fitting Problems,,\"Daiki Ikami, The University of Tokyo; Toshihiko Yamasaki, The University of Tokyo; Kiyoharu Aizawa,\",\r\n3558,Poster,Local and Global Optimization Techniques in Graph-based Clustering,,\"Daiki Ikami, The University of Tokyo; Toshihiko Yamasaki, The University of Tokyo; Kiyoharu Aizawa,\",\r\n3569,Poster,Generating a Fusion Image: One's Identity and Another's Shape,,\"DongGyu Joo, KAIST; Doyeon Kim, KAIST; Junmo Kim, KAIST\",\r\n3576,Poster,Categorizing Concepts with Basic Level for Vision-to-Language,,\"Hanzhang Wang, Tongji University; Hanli Wang, Tongji University; Kaisheng Xu, Tongji University\",\r\n3577,Poster,Importance Weighted Adversarial Nets for Partial Domain Adaptation,,\"Jing Zhang, University of Wollongong; Zewei Ding, University of Wollongong; Wanqing Li, ; Philip Ogunbona, University of Wollongong\",\r\n3604,Poster,Multi-task Learning by Maximizing Statistical Dependence,,\"Youssef Alami Mejjati, University of Bath; Darren Cosker, University of Bath; Kwang In Kim, University of Bath\",\r\n3617,Poster,Encoder-Decoder Alignment for Zero-Pair Image-to-Image Translation,,\"Yaxing Wang, Computer vision center; Joost van de Weijer, Computer Vision Center Barcelona; Luis Herranz, Computer Vision Center\",\r\n3618,Poster,Dynamic Feature Learning for Partial Face Recognition,,\"Lingxiao He, Institute of Automation, Chines; Haiqing Li, ; Qi Zhang, ; Zhenan Sun, CRIPAC\",\r\n3629,Poster,Multi-Image Semantic Matching by Mining Consistent 
Features,,\"Qianqian Wang, Zhejiang University; Xiaowei Zhou, Zhejiang University; Kostas Daniilidis, University of Pennsylvania\",\r\n3630,Poster,Indoor RGB-D Compass from a Single Line and Plane,,\"Pyojin Kim, Seoul National University; Brian Coltin, NASA Ames Research Center; H. Jin Kim,\",\r\n3636,Poster,Multispectral Image Intrinsic Decomposition via Low Rank Constraint,,\"Qian Huang, Nanjing University; Zhu Weixin, Nanjing university; Yang Zhao, Nanjing University; Linsen Chen, Nanjing University; yao wang, new york university; Tao Yue, Nanjing Univ.; Xun Cao, EE Department, Nanjing Univ\",\r\n3646,Poster,Non-Linear Temporal Subspace Representations for Activity Recognition,,\"Anoop Cherian, ; Suvrit Sra, MIT; Stephen Gould, Australian National University; Richard Hartley, Australian National University Australia\",\r\n3680,Poster,MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features,,\"Liang-Chieh Chen, ; Alexander Hermans, RWTH Aachen University; George Papandreou, Google Inc.; Florian Schroff, Google Inc.; Peng Wang, Baidu; Hartwig Adam, Google\",\r\n3687,Poster,Robust Classification with Convolutional Prototype Learning,,\"Hong-Ming Yang, Institute of Automation, Chinese Academy of Sciences; Xu-Yao Zhang, Institute of Automation, Chinese Academy of Sciences; Fei Yin, Institute of Automation, Chinese Academy of Sciences; cheng-lin Liu,\",\r\n3695,Poster,Normalized Cut Loss for Weakly Supervised CNN Segmentation,,\"Meng Tang, UWO; Federico Perazzi, Disney Research Zurich; Abdelaziz Djelouah, The Walt Disney Company; Yuri Boykov, University of Western Ontario; Christopher Schroers, Disney Research Zurich\",\r\n3711,Poster,CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization,,\"Frederick Tung, Simon Fraser University; Greg Mori,\",\r\n3713,Poster,Inference in Higher Order MRF-MAP Problems with Small and Large Cliques,,\"Ishant Shanu, Iiit delhi; Chetan Arora, Indraprastha Institute of 
Information Technology Delhi; S.N. Maheshwari, IIT Delhi\",\r\n3736,Poster,Learning to Estimate 3D Human Pose and Shape from a Single Color Image,,\"Georgios Pavlakos, ; Luyang Zhu, Peking University; Xiaowei Zhou, Zhejiang University; Kostas Daniilidis, University of Pennsylvania\",\r\n3739,Poster,Revisiting knowledge transfer for training object class detectors,,\"Jasper Uijlings, Google; Stefan Popov, Google; Vitto Ferrari,\",\r\n3751,Poster,Aperture Supervision for Monocular Depth Estimation,,\"Pratul Srinivasan, Berkeley; Rahul Garg, ; Neal Wadhwa, ; Ren Ng, Berkeley; Jonathan Barron, Google\",\r\n3780,Poster,A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos,,\"Michel Silva, Universidade de Minas Gerais; Washington Luis Ramos, Universidade Federal de Minas Gerais; Jo_ Pedro Ferreira, Universidade Federal de Minas Gerais; Felipe Chamone, Universidade Federal de Minas Gerais; Mario F Campos, Universidade Federal de Minas Gerais; Erickson Nascimento, Universidade Federal de Minas Gerais\",\r\n3798,Poster,Active Fixation Control to Predict Saccade Sequences,,\"Calden Wloka, York University; Iuliia Kotseruba, York University; John Tsotsos, York University Canada\",\r\n3818,Poster,Depth-Aware Stereo Video Retargeting,,\"Bing Li, University of Southern Califor; Chia-Wen Lin, ; Tiejun Huang, ; Boxin Shi, Peking University; Wen Gao, ; C.-C. 
Jay Kuo, University of Southern California\",\r\n3822,Poster,Learning from Noisy Web Data with Category-level Supervision,,\"Li Niu, Rice University; Qingtao Tang, ; Ashok Veeraraghavan, Rice University; Ashutosh Sabharwal,\",\r\n3826,Poster,\"Pixels, voxels, and views: A study of shape representations for single view 3D object shape prediction\",,\"Daeyun Shin, UC Irvine; Charless Fowlkes, University of California, Irvine, USA; Derek Hoiem,\",\r\n3855,Poster,Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning,,\"Vasili Ramanishka, Boston University; Yi-Ting Chen, Honda Research Institute USA; Teruhisa Misu, Honda Research Institute; Kate Saenko,\",\r\n3857,Poster,Generative Adversarial Image Synthesis with Decision Tree Latent Controller,,\"Takuhiro Kaneko, NTT Corporation; Kaoru Hiramatsu, NTT Corporation; Kunio Kashino, NTT\",\r\n3867,Poster,Cross-View Image Synthesis using Conditional Generative Adversarial Nets,,\"Krishna Regmi, Ucf; Ali Borji, UCF\",\r\n3869,Poster,Focus Manipulation Detection via Photometric Histogram Analysis,,\"Can Chen, University of Delaware; Scott McCloskey, Honeywell; Jingyi Yu, University of Delaware, USA\",\r\n3870,Poster,\"Efficient, sparse representation of manifold distance matrices for classical scaling\",,\"Alexander Huth, University of Texas at Austin; Javier Turek, Intel Corporation\",\r\n3871,Poster,A Robust Method for Strong Rolling Shutter Effects Correction Using Lines with Automatic Feature Selection,,\"Yizhen Lao, Institut Pascal; Omar Ait-Aider, Institut Pascal\",\r\n3878,Poster,Learning Attribute Representations with Localization for Flexible Fashion Search,,\"Kenan Ak, National University of Singapo; Joo Hwee Lim, I2R, Astar; Ashraf Kassim, ; JO YEW THAM,\",\r\n3887,Poster,Long-Term On-Board Prediction of People in Traffic Scenes under Uncertainty,,\"Apratim Bhattacharyya, MPI Informatics; Bernt Schiele, MPI Informatics Germany; Mario Fritz, MPI, Saarbrucken, 
Germany\",\r\n3903,Poster,Wrapped Gaussian Process Regression on Riemannian Manifolds,,\"Anton Mallasto, University of Copenhagen; Aasa Feragen, University of Copenhagen\",\r\n3923,Poster,ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes,,\"Yuhua Chen, CVL@ETHZ; Wen Li, ETH; Luc Van Gool, KTH\",\r\n3941,Poster,Im2Struct: Recovering 3D Shape Structure from a Single RGB Image,,\"Chengjie Niu, National University of Defense Technology; Jun Li, ; Kai Xu, NUDT & Princeton University\",\r\n3979,Poster,Bidirectional Retrieval Made Simple,,\"Jônatas Wehrmann, PUCRS; Rodrigo Barros, PUCRS\",\r\n3980,Poster,A Twofold Siamese Network for Real-Time Object Tracking,,\"Anfeng He, USTC; Chong Luo, Microsoft Research Asia; Xinmei Tian, USTC; Wenjun Zeng,\",\r\n3981,Poster,Multi-shot Pedestrian Re-identification via Sequential Decision Making,,\"Jianfu Zhang, Shanghai Jiaotong University; Naiyan Wang, tusimple; Liqing Zhang, Shanghai Jiaotong University\",\r\n3989,Poster,Wide Compression: Tensor Ring Nets,,\"Wenqi Wang, Purdue University; Yifan Sun, Technicolor Research; Brian Eriksson, Adobe; Wenlin Wang, Duke University; Vaneet Aggarwal, Purdue University\",\r\n3991,Poster,Improvements to context based self-supervised learning,,\"Terrell Mundhenk, LLNL; Daniel Ho, LLNL; Barry Chen, LLNL\",\r\n3993,Poster,On the convergence of PatchMatch and its variants,,\"Thibaud EHRET, CMLA, ENS Cachan; Pablo Arias, CMLA, ENS Cachan\",\r\n4001,Poster,Fast Monte-Carlo Localization on Aerial Vehicles using Approximate Continuous Belief Representations,,\"Aditya Dhawale, Carnegie Mellon University; Kumar Shaurya Shankar, Carnegie Mellon University; Nathan Michael, Carnegie Mellon University\",\r\n4014,Poster,Learning Multi-Instance Enriched Image Representation via Non-Greedy Simultaneous L1-Norm Minimization and Maximization,,\"Hua Wang, Colorado School of Mines\",\r\n4023,Poster,Automatic Map Inference from Aerial Images,,\"Favyen Bastani, MIT CSAIL; Songtao He, MIT 
CSAIL; Mohammad Alizadeh, MIT CSAIL; Hari Balakrishnan, MIT CSAIL; Sam Madden, MIT CSAIL; Sanjay Chawla, Qatar Computing Research Institute; Sofiane Abbar, Qatar Computing Research Institute; David DeWitt, MIT CSAIL\",\r\n4066,Poster,Exploiting Transitivity for Learning Person Re-identification Models on a Budget,,\"Sourya Roy, UC Riverside ; Sujoy Paul, UC Riverside; Neal Young, UC Riverside ; Amit Roy-Chowdhury, UC Riverside\",\r\n4075,Poster,Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display,,\"WENJUAN LIAO, NTU, Singapore\",\r\n4099,Poster,A Robust Generative Framework for Generalized Zero-Shot Learning,,\"Vinay Verma, IIT Kanpur; Gundeep Arora, IIT Kanpur; Ashish Mishra, IIT MADRAS; Piyush Rai, IIT Kanpur\",\r\n4109,Poster,3D Registration of Curves and Surfaces using Local Differential Information,,\"Carolina Raposo, Institute of Systems and Robot; Joao Barreto, University of Coimbra, Portugal\",\r\n4110,Poster,Trust your Model: Light Field Depth Estimation with inline Occlusion Handling,,\"Hendrik Schilling, Universität Heidelberg; Maximilian Diebold, Heidelberg University; Carsten Rother, University of Heidelberg; Bernd Jähne, University of Heidelberg\",\r\n4112,Poster,Partially Shared Multi-Task Convolutional Neural Network with Local Constraint for Face Attribute Learning,,\"Jiajiong Cao, ; Yingming Li, Zhejiang University; Zhongfei Zhang,\",\r\n4149,Poster,Learning Structure and Strength of CNN Filters for Small Sample Size Training,,\"Rohit Keshari, IIIT Delhi; Mayank Vatsa, IIIT Delhi; Richa Singh, IIT Delhi; Afzel Noore, WVU\",\r\n4172,Poster,Motion Segmentation by Exploiting Complementary Geometric Models,,\"Xun Xu, National University of Singapore; Loong Fah Cheong, National University of Singapore; Zhuwen Li, Intel Labs\",\r\n4186,Poster,Unsupervised Learning of Single View Depth Estimation and Visual Odometry with Deep Feature Reconstruction,,\"Huangying Zhan, The University of Adelaide; Ravi Garg, The 
University of Adelaide; Chamara Weerasekera, The University of Adelaide; Kejie Li, The University of Adelaide; Harsh Agarwal, Indian Institute of Technology (BHU); Ian Reid,\",\r\n4208,Poster,GAGAN: Geometry Aware Generative Adversarial Networks,,\"Jean Kossaifi, Imperial College London; Linh Tran, Imperial College London; Yannis Panagakis, ; Maja Pantic, Imperial College London, UK\",\r\n4238,Poster,Learning Representations for Single Cells in Microscopy Images,,\"Juan Caicedo, Broad Institute of Harvard and; Claire Mcquin, Broad Institute of Harvard and MIT; Allen Goodman, Broad Institute of Harvard and MIT; Shantanu Singh, Broad Institute of Harvard and MIT; Anne Carpenter, Broad Institute of Harvard and MIT\",\r\n4239,Poster,\"Estimation of Camera Locations in Highly Corrupted Scenarios: All About the Base, No Shape Trouble\",,\"Yunpeng Shi, University of Minnesota; Gilad Lerman, University of Minnesota\",\r\n4244,Poster,Deep Spatial Feature Reconstruction for Partial Person Re-identification,,\"Lingxiao He, Institute of Automation, Chines; Jian Liang, CASIA; Haiqing Li, ; Zhenan Sun, CRIPAC\",\r\n4249,Poster,Cross-Dataset Adaptation for Visual Question Answering,,\"Wei-Lun Chao, USC; Hexiang Hu, ; Fei Sha, University of Southern California\",\r\n4255,Poster,Learning Visual Knowledge Memory Networks for Visual Question Answering,,\"Zhou Su, ; Jianguo Li, Intel Lab; Zhiqiang Shen, Fudan University; Yurong Chen,\",\r\n4265,Poster,Compassionately Conservative Balanced Cuts for Image Segmentation,,\"Nathan Cahill, Rochester Institute of Technol; Tyler Hayes, Rochester Institute of Tech; Renee Meinhold, Rochester Institute of Technology; John Hamilton, RIT\",\r\n4272,Poster,Neural Motifs: Scene Graph Parsing with Global Context,,\"Rowan Zellers, University of Washington; Mark Yatskar, University of Washington; Samuel Thomson, Carnegie Mellon University; Yejin Choi, University of Washington\",\r\n4286,Poster,Alternating-Stereo VINS: Observability Analysis and 
Performance Evaluation,,\"Mrinal Kanti Paul, Google; Stergios Roumeliotis, Google\"," }, { "path": "cvpr_2019_githublinks.csv", "content": "No,title,github/code,paper reading article,project link,Catorgory,Oral day,Oral session,Oral time,Poster day,Poster session,Poster #,First author,Author\r\n304,A Simple Pooling-Based Design for Real-Time Salient Object Detection,http://mmcheng.net/poolnet/ ,,,,,,,Tuesday,Poster 1.2,163,Jiang-Jiang Liu,\"Jiang-Jiang Liu, Qibin Hou, Ming-Ming Cheng, Jiashi Feng, Jianmin Jiang\"\r\n305,DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds,https://ai4ce.github.io/DeepMapping/,,,,Thursday,Oral 3.1A,9:47,Thursday,Poster 3.1,203,Li Ding,\"Li Ding, Chen Feng\"\r\n306,Action Recognition From Single Timestamp Supervision in Untrimmed Videos,https://bitbucket.org/dmoltisanti/action_recognition_single_timestamps/src/master/,,https://dimadamen.github.io/single_timestamps/,,,,,Thursday,Poster 3.1,129,Davide Moltisanti,\"Davide Moltisanti, Sanja Fidler, Dima Damen\"\r\n307,FilterReg: Robust and Efficient Probabilistic Point-Set Registration Using Gaussian Filter and Twist Parameterization,https://bitbucket.org/gaowei19951004/poser/src/master/ ,,,,Thursday,Oral 3.2C,14:34,Thursday,Poster 3.2,158,Wei Gao,\"Wei Gao, Russ Tedrake\"\r\n117,Unsupervised Deep Tracking,https://github.com/594422814/UDT,,,,,,,Tuesday,Poster 1.1,132,Ning Wang,\"Ning Wang, Yibing Song, Chao Ma, Wengang Zhou, Wei Liu, Houqiang Li\"\r\n308,Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence,https://github.com/A2Zadeh/Social-IQ,,,,Thursday,Oral 3.1B,9:52,Thursday,Poster 3.1,91,Amir Zadeh,\r\n1059,All About Structure: Adapting Structural Information Across Domains for Boosting Semantic Segmentation,https://github.com/a514514772/DISE-Domain-Invariant-Structure-Extraction,,,,,,,Tuesday,Poster 1.1,192,Wei-Lun Chang,\r\n309,Reflection Removal Using a Dual-Pixel 
Sensor,https://github.com/abhijithpunnappurath/dprr,,https://abhijithpunnappurath.github.io/dprr,,,,,Tuesday,Poster 1.1,157,Abhijith Punnappurath,\"Abhijith Punnappurath, Michael S. Brown\"\r\n261,Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-To-Image Translation,https://github.com/aimagelab/art2real,,,,,,,Wednesday,Poster 2.1,161,Matteo Tomei,\"Matteo Tomei, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara\"\r\n150,Latent Space Autoregression for Novelty Detection,https://github.com/aimagelab/novelty-detection,,,,,,,Tuesday,Poster 1.1,25,Davide Abati,\"Davide Abati, Angelo Porrello, Simone Calderara, Rita Cucchiara\"\r\n96,\"Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions\",https://github.com/aimagelab/show-control-and-tell,,,,,,,Wednesday,Poster 2.2,195,Marcella Cornia,\"Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara\"\r\n123,Strike (With) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects,https://github.com/airalcorn2/strike-with-a-pose,,,,,,,Wednesday,Poster 2.1,29,Michael A. Alcorn,\"Michael A. Alcorn, Qi Li, Zhitao Gong, Chengfei Wang, Long Mai, Wei-Shinn Ku, Anh Nguyen\"\r\n310,Learning 3D Human Dynamics From Video,https://github.com/akanazawa/human_dynamics?,,,,,,,Wednesday,Poster 2.1,122,Angjoo Kanazawa,\"Angjoo Kanazawa, Jason Y. 
Zhang, Panna Felsen, Jitendra Malik\"\r\n311,Video Action Transformer Network,https://github.com/alainray/action_transformer,,,,Tuesday,Oral 1.1C,9:00,Tuesday,Poster 1.1,109,Rohit Girdhar,\"Rohit Girdhar, João Carreira, Carl Doersch, Andrew Zisserman\"\r\n312,Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction,https://github.com/alexklwong/adareg-monodispnet,,,,,,,Wednesday,Poster 2.1,125,Alex Wong,\"Alex Wong, Byung-Woo Hong, Stefano Soatto\"\r\n81,Animating Arbitrary Objects via Deep Motion Transfer,https://github.com/AliaksandrSiarohin/monkey-net,,,,Tuesday,Oral 1.2B,14:06,Tuesday,Poster 1.2,124,Aliaksandr Siarohin,\"Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe\"\r\n194,ELASTIC: Improving CNNs With Dynamic Scaling Policies,https://github.com/allenai/elastic,,,,Tuesday,Oral 1.2A,14:42,Tuesday,Poster 1.2,30,Huiyu Wang,\"Huiyu Wang, Aniruddha Kembhavi, Ali Farhadi, Alan Yuille, Mohammad Rastegari\"\r\n129,Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning,https://github.com/allenai/savn,,,,Wednesday,Oral 2.2B,15:10,Wednesday,Poster 2.2,194,Mitchell Wortsman,\"Mitchell Wortsman, Kiana Ehsani, Mohammad Rastegari, Ali Farhadi, Roozbeh Mottaghi\"\r\n636,Argoverse: 3D Tracking and Forecasting With Rich Maps,https://github.com/alliecc/argoverse_baselinetracker,,,,Thursday,Oral 3.1B,9:16,Thursday,Poster 3.1,85,Ming-Fang Chang,\r\n59,Recurrent Back-Projection Network for Video Super-Resolution,https://github.com/alterzero/RBPN-PyTorch,,https://alterzero.github.io/projects/RBPN.html,,,,,Tuesday,Poster 1.2,161,Muhammad Haris,\"Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita\"\r\n242,Trust Region Based Adversarial Attack on Neural Networks,https://github.com/amirgholami/trattack,,,,,,,Thursday,Poster 3.2,38,Zhewei Yao,\"Zhewei Yao, Amir Gholami, Peng Xu, Kurt Keutzer, Michael Mahoney\"\r\n313,GPSfM: Global Projective SFM Using Algebraic Constraints on 
Multi-View Fundamental Matrices,https://github.com/amnonge/GPSFM-Code,,,,,,,Tuesday,Poster 1.2,79,Yoni Kasten,\"Yoni Kasten, Amnon Geifman, Meirav Galun, Ronen Basri\"\r\n314,Depth From a Polarisation + RGB Stereo Pair,https://github.com/AmosZhu/CVPR2019,,,,,,,Wednesday,Poster 2.2,86,Dizhong Zhu,\"Dizhong Zhu, William A.P. Smith\"\r\n315,Improved Road Connectivity by Joint Learning of Orientation and Segmentation,https://github.com/anilbatra2185/road_connectivity,,,,,,,Thursday,Poster 3.1,177,Anil Batra,\"Anil Batra, Suriya Singh, Guan Pang, Saikat Basu, C.V. Jawahar, Manohar Paluri\"\r\n162,Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-Based Image Retrieval,https://github.com/AnjanDutta/sem-pcyc,,,,,,,Wednesday,Poster 2.1,54,Anjan Dutta,\"Anjan Dutta, Zeynep Akata\"\r\n205,Unsupervised Learning of Action Classes With Continuous Temporal Embedding,https://github.com/annusha/unsup_temp_embed,,,,,,,Thursday,Poster 3.2,129,Anna Kukleva,\"Anna Kukleva, Hilde Kuehne, Fadime Sener, Juergen Gall\"\r\n316,Bringing Alive Blurred Moments,https://github.com/anshulbshah/Blurred-Image-to-Video,,,,Wednesday,Oral 2.2C,14:11,Wednesday,Poster 2.2,137,Kuldeep Purohit,\r\n55,\"Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation\",https://github.com/anuragranj/cc,,,,,,,Thursday,Poster 3.2,165,Anurag Ranjan,\"Anurag Ranjan, Varun Jampani, Lukas Balles, Kihwan Kim, Deqing Sun, Jonas Wulff, Michael J. 
Black\"\r\n317,Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation,https://github.com/apple/ml-cvpr2019-swd,,,,,,,Thursday,Poster 3.1,167,Chen-Yu Lee,\"Chen-Yu Lee, Tanmay Batra, Mohammad Haris Baig, Daniel Ulbricht\"\r\n747,Attentive Feedback Network for Boundary-Aware Salient Object Detection,https://github.com/ArcherFMY/AFNet,,,,,,,Tuesday,Poster 1.1,164,Mengyang Feng,\r\n1224,Viewport Proposal CNN for 360deg Video Quality Assessment,https://github.com/Archer-Tatsu/V-CNN,,,,,,,Thursday,Poster 3.1,156,Chen Li,\r\n318,Zero-Shot Task Transfer,https://github.com/ArghyaPal/Zero-shot-task-transfer,,,,Tuesday,Oral 1.2A,13:58,Tuesday,Poster 1.2,23,Arghya Pal,\r\n282,A-CNN: Annularly Convolutional Neural Networks on Point Clouds,https://github.com/artemkomarichev/a-cnn,,,,,,,Wednesday,Poster 2.2,69,Artem Komarichev,\"Artem Komarichev, Zichun Zhong, Jing Hua\"\r\n319,Automatic Adaptation of Object Detectors to New Domains Using Self-Training,https://github.com/AruniRC/detectron-self-train,,http://vis-www.cs.umass.edu/self-train/,,,,,Tuesday,Poster 1.1,55,Aruni RoyChowdhury,\"Aruni RoyChowdhury, Prithvijit Chakrabarty, Ashish Singh, SouYoung Jin, Huaizu Jiang, Liangliang Cao, Erik Learned-Miller\"\r\n320,PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds,https://github.com/aseembehl/pointflownet,,,,,,,Wednesday,Poster 2.2,124,Aseem Behl,\"Aseem Behl, Despoina Paschalidou, Simon Donné, Andreas Geiger\"\r\n321,LAEO-Net: Revisiting People Looking at Each Other in Videos,https://github.com/AVAuco/ucolaeodb,,http://www.robots.ox.ac.uk/~vgg/research/laeonet,,,,,Tuesday,Poster 1.2,100,Manuel J. 
Marin-Jimenez,\"Manuel J. Marin-Jimenez, Vicky Kalogeiton, Pablo Medina-Suarez, Andrew Zisserman\"\r\n239,Noise-Aware Unsupervised Deep Lidar-Stereo Fusion,https://github.com/AvrilCheng/LidarStereoNet,,,,,,,Wednesday,Poster 2.1,211,Xuelian Cheng,\"Xuelian Cheng, Yiran Zhong, Yuchao Dai, Pan Ji, Hongdong Li\"\r\n322,Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning,https://github.com/AyanKumarBhunia/Handwriting_Recogition_using_Adversarial_Learning,,,,,,,Wednesday,Poster 2.1,21,Ayan Kumar Bhunia,\"Ayan Kumar Bhunia, Abhirup Das, Ankan Kumar Bhunia, Perla Sai Raj Kishore, Partha Pratim Roy\"\r\n92,Depth-Aware Video Frame Interpolation,https://github.com/baowenbo/DAIN,,,,,,,Tuesday,Poster 1.2,141,Wenbo Bao,\"Wenbo Bao, Wei-Sheng Lai, Chao Ma, Xiaoyun Zhang, Zhiyong Gao, Ming-Hsuan Yang\"\r\n69,GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction,https://github.com/barisgecer/ganfit,,,,,,,Tuesday,Poster 1.1,105,Baris Gecer,\"Baris Gecer, Stylianos Ploumpis, Irene Kotsia, Stefanos Zafeiriou\"\r\n323,RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation,https://github.com/bastianwandt/RepNet,,,,,,,Wednesday,Poster 2.2,106,Bastian Wandt,\"Bastian Wandt, Bodo Rosenhahn\"\r\n324,End-To-End Projector Photometric Compensation,https://github.com/BingyaoHuang/compennet,,,,Wednesday,Oral 2.2C,13:58,Wednesday,Poster 2.2,135,Bingyao Huang,\"Bingyao Huang, Haibin Ling\"\r\n185,A Poisson-Gaussian Denoising Dataset With Real Fluorescence Microscopy Images,https://github.com/bmmi/denoising-fluorescence,,,,,,,Thursday,Poster 3.2,75,Yide Zhang,\"Yide Zhang, Yinhao Zhu, Evan Nichols, Qingfei Wang, Siyuan Zhang, Cody Smith, Scott Howard\"\r\n184,Deep RNN Framework for Visual Sequential Applications,https://github.com/BoPang1996/Deep-RNN-Framework,,,,,,,Tuesday,Poster 1.1,19,Bo Pang,\"Bo Pang, Kaiwen Zha, Hanwen Cao, Chen Shi, Cewu Lu\"\r\n325,3D Hand Shape and Pose From Images in the 
Wild,https://github.com/boukhayma/3dhand,,,,Thursday,Oral 3.2B,13:53,Thursday,Poster 3.2,96,Adnane Boukhayma,\"Adnane Boukhayma, Rodrigo de Bem, Philip H.S. Torr\"\r\n151,DeepCaps: Going Deeper With Capsule Networks,https://github.com/brjathu/deepcaps,,,,Thursday,Oral 3.2A,14:29,Thursday,Poster 3.2,11,Jathushan Rajasegaran,\"Jathushan Rajasegaran, Vinoj Jayasundara, Sandaru Jayasekara, Hirunima Jayasekara, Suranga Seneviratne, Ranga Rodrigo\"\r\n84,MUREL: Multimodal Relational Reasoning for Visual Question Answering,https://github.com/Cadene/murel.bootstrap.pytorch,https://zhuanlan.zhihu.com/p/60972299,,,,,,Tuesday,Poster 1.1,201,Remi Cadene,\"Remi Cadene, Hedi Ben-younes, Matthieu Cord, Nicolas Thome\"\r\n326,Pyramid Feature Attention Network for Saliency Detection,https://github.com/CaitinZhao/cvpr2019_Pyramid-Feature-Attention-Network-for-Saliency-detection,,,,,,,Tuesday,Poster 1.2,61,Ting Zhao,\"Ting Zhao, Xiangqian Wu\"\r\n902,\"See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks\",https://github.com/carrierlxk/COSNet,,,,,,,Tuesday,Poster 1.2,115,Xiankai Lu,\r\n327,\"Zoom to Learn, Learn to Zoom\",https://github.com/ceciliavision/zoom-learn-zoom,,https://ceciliavision.github.io/project-pages/project-zoom.html?,,,,,Tuesday,Poster 1.2,147,Xuaner Zhang,\"Xuaner Cecilia Zhang, Qifeng Chen, Ren Ng, Vladlen Koltun \"\r\n206,Generating Multiple Hypotheses for 3D Human Pose Estimation With Mixture Density Network,https://github.com/chaneyddtt/Generating-Multiple-Hypotheses-for-3D-Human-Pose-Estimation-with-Mixture-Density-Network,,,,,,,Thursday,Poster 3.1,126,Chen Li,\"Chen Li, Gim Hee Lee\"\r\n328,Hybrid-Attention Based Decoupled Metric Learning for Zero-Shot Image Retrieval,https://github.com/chenbinghui1/Hybrid-Attention-based-Decoupled-Metric-Learning?,,,,,,,Tuesday,Poster 1.2,9,Binghui Chen,\"Binghui Chen, Weihong Deng\"\r\n82,Photometric Mesh Optimization for Video-Aligned 3D Object 
Reconstruction,https://github.com/chenhsuanlin/photometric-mesh-optim,,https://chenhsuanlin.bitbucket.io/photometric-mesh-optim/,,,,,Tuesday,Poster 1.1,86,Chen-Hsuan Lin,\"Pelin Dogan, Leonid Sigal, Markus Gross\"\r\n118,Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images,https://github.com/chenwydj/ultra_high_resolution_segmentation,,,,Thursday,Oral 3.1C,9:34,Thursday,Poster 3.1,66,Wuyang Chen,\"Wuyang Chen, Ziyu Jiang, Zhangyang Wang, Kexin Cui, Xiaoning Qian\"\r\n329,Fully Automatic Video Colorization With Self-Regularization and Diversity,https://github.com/ChenyangLEI/Fully-Automatic-Video-Colorization-with-Self-Regularization-and-Diversity,,,,,,,Tuesday,Poster 1.2,146,Chenyang Lei,\r\n330,Multi-Label Image Recognition With Graph Convolutional Networks,https://github.com/chenzhaomin123/ML_GCN,,,,,,,Wednesday,Poster 2.1,63,Zhao-Min Chen,\"Zhao-Min Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo\"\r\n80,Learning Attraction Field Representation for Robust Line Segment Detection,https://github.com/cherubicXN/afm_cvpr2019,,,,,,,Tuesday,Poster 1.1,161,Nan Xue,\"Nan Xue,?Song Bai,?Fudong Wang,?Gui-Song Xia,?Tianfu Wu,?Liangpei Zhang\"\r\n101,The Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation,https://github.com/chihyaoma/regretful-agent,,,,Wednesday,Oral 2.2B,15:00,Wednesday,Poster 2.2,192,Chih-Yao Ma,\"Chih-Yao Ma, Zuxuan Wu, Ghassan AlRegib, Caiming Xiong, Zsolt Kira\"\r\n262,Iterative Residual CNNs for Burst Photography Applications,https://github.com/cig-skoltech/burst-cvpr-2019,,,,,,,Wednesday,Poster 2.1,169,Filippos Kokkinos,Filippos Kokkinos ? 
Stamatis Lefkimmiatis \r\n119,Improving Transferability of Adversarial Examples With Input Diversity,https://github.com/cihangxie/DI-2-FGSM,,,,,,,Tuesday,Poster 1.2,7,Cihang Xie,Cihang Xie; Yuyin Zhou; Song Bai; Zhishuai Zhang; Jianyu Wang; Zhou Ren; Alan Yuille\r\n132,TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments,https://github.com/clic-lab/touchdown,,,,,,,Thursday,Poster 3.2,195,Howard Chen,\"Howard Chen, Alane Suhr, Dipendra Misra, Noah Snavely, Yoav Artzi\"\r\n679,Character Region Awareness for Text Detection,https://github.com/clovaai/CRAFT-pytorch,,,,,,,Thursday,Poster 3.1,43,Youngmin Baek,\r\n563,\"Monocular Total Capture: Posing Face, Body, and Hands in the Wild\",https://github.com/CMU-Perceptual-Computing-Lab/MonocularTotalCapture,,http://domedb.perception.cs.cmu.edu/monototalcapture.html,,Thursday,Oral 3.2B,15:05,Thursday,Poster 3.2,108,Donglai Xiang,\"Donglai Xiang, Hanbyul Joo, Yaser Sheikh\"\r\n68,Panoptic Segmentation,https://github.com/cocodataset/panopticapi,,,,,,,Thursday,Poster 3.1,47,Alexander Kirillov,\"Huanyu Liu, Chao Peng, Changqian Yu, Jingbo Wang, Xu Liu, Gang Yu, Wei Jiang\"\r\n331,Towards Natural and Accurate Future Motion Prediction of Humans and Animals,https://github.com/CoderStellaJ/Hierarchical-Motion-Recurrent-Network-Website,https://coderstellaj.github.io/Hierarchical-Motion-Recurrent-Network-Website/,,,,,,Thursday,Poster 3.1,138,Zhenguang Liu,Zhenguang Liu (Zhejiang Gongshang University)*; Shuang Wu (Nanyang Technological University); Shuyuan Jin (NUS); Qi Liu (National University of Singapore); Shijian Lu (Nanyang Technological University); Roger Zimmermann (NUS); Li Cheng (University of Alberta)\r\n1067,Deep Defocus Map Estimation Using Domain Adaptation,https://github.com/codeslake/SYNDOF,,,,,,,Thursday,Poster 3.2,145,Junyong Lee,\r\n332,COIN: A Large-Scale Dataset for Comprehensive Instructional Video 
Analysis,https://github.com/coin-dataset/code,https://mp.weixin.qq.com/s/YFmgdIEfLCvL5LLGQ0vesw,https://coin-dataset.github.io/,,,,,Tuesday,Poster 1.1,122,Yansong Tang,\"Yansong Tang, Dajun Ding, Yongming Rao, Yu Zheng, Danyang Zhang, Lili Zhao, Jiwen Lu, Jie Zhou\"\r\n333,Context-Aware Crowd Counting,https://github.com/CommissarMa/Context-Aware_Crowd_Counting-pytorch,,,,,,,Wednesday,Poster 2.1,55,Weizhe Liu,\"Weizhe Liu, Mathieu Salzmann, Pascal Fua\"\r\n334,Acoustic Non-Line-Of-Sight Imaging,https://github.com/computational-imaging/AcousticNLOS,,http://www.computationalimaging.org/publications/acoustic-non-line-of-sight-imaging/,,Wednesday,Oral 2.2C,13:40,Wednesday,Poster 2.2,132,David B. Lindell,\"David B. Lindell, Gordon Wetzstein, Vladlen Koltun\"\r\n335,Divide and Conquer the Embedding Space for Metric Learning,https://github.com/CompVis/metric-learning-divide-and-conquer,,,,,,,Tuesday,Poster 1.1,24,Artsiom Sanakoyeu,\"Artsiom Sanakoyeu, Vadim Tschernezki, Uta Büchler, Björn Ommer\"\r\n1277,A Neural Temporal Model for Human Motion Prediction,https://github.com/cr7anand/neural_temporal_models,,,,,,,Thursday,Poster 3.2,134,Anand Gopalakrishnan,\r\n130,Progressive Image Deraining Networks: A Better and Simpler Baseline,https://github.com/csdwren/PReNet,,,,,,,Tuesday,Poster 1.2,165,Dongwei Ren,\"Dongwei Ren, Wangmeng Zuo, Qinghua Hu, Pengfei Zhu, Deyu Meng\"\r\n186,Supervised Fitting of Geometric Primitives to 3D Point Clouds,https://github.com/csimstu2/SPFN,,,,Tuesday,Oral 1.2C,15:05,Tuesday,Poster 1.2,182,Lingxiao Li,\"Lingxiao Li, Minhyuk Sung, Anastasia Dubrovina, Li Yi, Leonidas Guibas\"\r\n336,STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing,https://github.com/csmliu/STGAN,,,,,,,Tuesday,Poster 1.2,138,Ming Liu,\"Ming Liu, Yukang Ding, Min Xia, Xiao Liu, Errui Ding, Wangmeng Zuo, Shilei Wen\"\r\n28,Deep Plug-And-Play Super-Resolution for Arbitrary Blur 
Kernels,https://github.com/cszn/DPSR,,,,,,,Tuesday,Poster 1.1,169,Kai Zhang,\"Kai Zhang, Wangmeng Zuo, Lei Zhang\"\r\n192,Segmentation-Driven 6D Object Pose Estimation,https://github.com/cvlab-epfl/segmentation-driven-pose,,,,,,,Tuesday,Poster 1.2,91,Yinlin Hu,\"Yinlin Hu, Joachim Hugonot, Pascal Fua, Mathieu Salzmann\"\r\n179,Learning to Adapt for Stereo,https://github.com/CVLAB-Unibo/Learning2AdaptForStereo,,,,,,,Thursday,Poster 3.1,103,Alessio Tonioni,\"Alessio Tonioni, Oscar Rahnama, Thomas Joy, Luigi Di Stefano, Thalaiyasingam Ajanthan, Philip H. S. Torr\"\r\n76,Real-Time Self-Adaptive Deep Stereo,https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stereo,,,,Tuesday,Oral 1.1B,9:41,Tuesday,Poster 1.1,79,Alessio Tonioni,\"Alessio Tonioni, Fabio Tosi, Matteo Poggi, Stefano Mattoccia, Luigi Di Stefano?\"\r\n337,Example-Guided Style-Consistent Image Synthesis From Semantic Labeling,https://github.com/cxjyxxme/pix2pixSC,,,,,,,Tuesday,Poster 1.1,151,Miao Wang,\"Miao Wang1, Guo-Ye Yang2, Ruilong Li2, Run-Ze Liang2, Song-Hai Zhang2, Peter M. Hall3?and Shi-Min Hu\"\r\n62,Rethinking Knowledge Graph Propagation for Zero-Shot Learning,https://github.com/cyvius96/adgpm,,,,,,,Thursday,Poster 3.2,52,Michael Kampffmeyer,\"Michael Kampffmeyer, Yinbo Chen, Xiaodan Liang, Hao Wang, Yujia Zhang, Eric P. Xing\"\r\n56,Learning Implicit Fields for Generative Shape Modeling,https://github.com/czq142857/implicit-decoder?,,,,,,,Wednesday,Poster 2.1,170,Zhiqin Chen,\"Zhiqin Chen, Hao Zhang\"\r\n14,PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding,https://github.com/daerduoCarey/partnet_dataset,,https://cs.stanford.edu/~kaichun/partnet/,,,,,Tuesday,Poster 1.1,68,Kaichun Mo,\"Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. 
Guibas, Hao Su\"\r\n338,Visual Tracking via Adaptive Spatially-Regularized Correlation Filters,https://github.com/Daikenan/ASRCF ,,,,Wednesday,Oral 2.1C,9:29,Wednesday,Poster 2.1,145,Kenan Dai,\"Kenan Dai, Dong Wang, Huchuan Lu, Chong Sun, Jianhua Li\"\r\n339,Second-Order Attention Network for Single Image Super-Resolution,https://github.com/daitao/SAN,,,,Thursday,Oral 3.2C,14:16,Thursday,Poster 3.2,155,Tao Dai,\r\n340,Dual Encoding for Zero-Example Video Retrieval,https://github.com/danieljf24/dual_encoding,,,,,,,Thursday,Poster 3.1,41,Jianfeng Dong,\"Jianfeng Dong, Xirong Li, Chaoxi Xu, Shouling Ji, Yuan He, Gang Yang and Xun Wang\"\r\n341,MAGSAC: Marginalizing Sample Consensus,https://github.com/danini/magsac,,,,,,,Thursday,Poster 3.1,158,Daniel Barath,\r\n957,Disentangling Adversarial Robustness and Generalization,https://github.com/davidstutz/cvpr2019-adversarial-robustness,,,,,,,Wednesday,Poster 2.2,5,David Stutz,\r\n512,Few-Shot Learning With Localization in Realistic Settings,https://github.com/daviswer/fewshotlocal,,,,Wednesday,Oral 2.2A,15:05,Wednesday,Poster 2.2,41,Davis Wertheimer,\r\n231,Vision-Based Navigation With Language-Based Assistance via Imitation Learning With Indirect Intervention,https://github.com/debadeepta/vnla,,,,,,,Thursday,Poster 3.2,194,Khanh Nguyen,\"Khanh Nguyen, Debadeepta Dey, Chris Brockett, Bill Dolan\"\r\n15,ABC: A Big CAD Model Dataset for Geometric Deep Learning,https://github.com/deep-geometry/abc-dataset?,,,,,,,Thursday,Poster 3.1,97,Sebastian Koch,\"Sebastian Koch, Albert Matveev, Zhongshi Jiang, Francis Williams, Alexey Artemov, Evgeny Burnaev, Marc Alexa, Denis Zorin, Daniele Panozzo\"\r\n4,ArcFace: Additive Angular Margin Loss for Deep Face Recognition,https://github.com/deepinsight/insightface,,,,Wednesday,Oral 2.1C,9:42,Wednesday,Poster 2.1,147,Jiankang Deng,\"Jiankang Deng, Jia Guo, Niannan Xue, Stefanos Zafeiriou\"\r\n342,Exploiting Temporal Context for 3D Human Pose Estimation in the 
Wild,https://github.com/deepmind/Temporal-3D-Pose-Kinetics?,,,,,,,Tuesday,Poster 1.2,92,Anurag Arnab,\"Anurag Arnab, Carl Doersch, Andrew Zisserman\"\r\n343,Shifting More Attention to Video Salient Object Detection,https://github.com/DengPingFan/DAVSOD,,,,Thursday,Oral 3.1A,8:48,Thursday,Poster 3.1,193,Deng-Ping Fan,\"Deng-Ping Fan, Wenguan Wang, Ming-Ming Cheng, Jianbing Shen\"\r\n144,Transferable Interactiveness Knowledge for Human-Object Interaction Detection,https://github.com/DirtyHarryLYL/Transferable-Interactiveness-Network,,,,,,,Tuesday,Poster 1.2,111,Yong-Lu Li,\"Yong-Lu Li, Siyuan Zhou, Xijie Huang, Liang Xu, Ze Ma, Hao-Shu Fang, Yan-Feng Wang, Cewu Lu\"\r\n293,Beyond Gradient Descent for Regularized Segmentation Losses,https://github.com/dmitrii-marin/adm-seg,,,,,,,Thursday,Poster 3.1,157,Dmitrii Marin,\"Dmitrii Marin, Meng Tang, Ismail Ben Ayed, Yuri Boykov\"\r\n6,Bag of Tricks for Image Classification with Convolutional Neural Networks,https://github.com/dmlc/gluon-cv,,,,,,,Tuesday,Poster 1.1,33,Tong He,\"Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li\"\r\n251,Cross-Task Weakly Supervised Learning From Instructional Videos,https://github.com/DmZhukov/CrossTask,,,,,,,Tuesday,Poster 1.2,106,Dimitri Zhukov,\"Dimitri Zhukov, Jean-Baptiste Alayrac, Ramazan Gokberk Cinbis, David Fouhey, Ivan Laptev, Josef Sivic\"\r\n207,LiFF: Light Field Features in Scale and Depth,https://github.com/doda42/LiFF,,,,,,,Wednesday,Poster 2.2,150,Donald G. Dansereau,\"Donald G. 
Dansereau, Bernd Girod, Gordon Wetzstein\"\r\n344,Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning,https://github.com/Dong-JinKim/DenseRelationalCaptioning,,,,,,,Wednesday,Poster 2.1,204,Dong-Jin Kim,\"Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, In So Kweon\"\r\n345,Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells,https://github.com/DrSleep/nas-segm-pytorch,,,,,,,Thursday,Poster 3.1,18,Vladimir Nekrasov,\"Vladimir Nekrasov, Hao Chen, Chunhua Shen, Ian Reid\"\r\n346,OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks,https://github.com/dsfour/OICSR,,,,,,,Wednesday,Poster 2.2,12,Jiashi Li,\"Jiashi Li, Qi Qi, Jingyu Wang, Ce Ge, Yujian Li, Zhangzhang Yue, Haifeng Sun\"\r\n347,CrossInfoNet: Multi-Task Information Sharing Based Hand Pose Estimation,https://github.com/dumyy/handpose,,,,,,,Thursday,Poster 3.1,127,Kuo Du,\"Kuo Du, Xiangbo Lin, Yi Sun, Xiaohong Ma\"\r\n348,Searching for a Robust Neural Architecture in Four GPU Hours,https://github.com/D-X-Y/GDAS,,,,,,,Tuesday,Poster 1.1,178,Xuanyi Dong,\"Xuanyi Dong, Yi Yang\"\r\n52,PointConv: Deep Convolutional Networks on 3D Point Clouds,https://github.com/DylanWusee/pointconv,,,,,,,Thursday,Poster 3.1,99,Wenxuan Wu,\"Wenxuan Wu, Zhongang Qi, Li Fuxin\"\r\n163,Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders,https://github.com/edgarschnfld/CADA-VAE-PyTorch,,,,,,,Wednesday,Poster 2.2,171,Edgar Schonfeld,\"Edgar Schönfeld, Sayna Ebrahimi, Samarth Sinha, Trevor Darrell, Zeynep Akata\"\r\n43,Precise Detection in Densely Packed Scenes,https://github.com/eg4000/SKU110K_CVPR19,,,,,,,Wednesday,Poster 2.1,68,Eran Goldman,\"Eran Goldman, Roei Herzig, Aviv Eisenschtat, Jacob Goldberger, Tal Hassner\"\r\n349,Mixture Density Generative Adversarial Networks,https://github.com/eghbalz/mdgan,,,,,,,Wednesday,Poster 2.1,158,Hamid Eghbal-zadeh,\"Hamid Eghbal-zadeh, Werner Zellinger, Gerhard 
Widmer\"\r\n294,Exploring the Bounds of the Utility of Context for Object Detection,https://github.com/EhudBarnea/ContextAnalysis,,,,,,,Wednesday,Poster 2.2,68,Ehud Barnea,\"Ehud Barnea, Ohad Ben-Shahar\"\r\n23,Data-Driven Neuron Allocation for Scale Aggregation Networks,https://github.com/Eli-YiLi/ScaleNet,,,,,,,Thursday,Poster 3.2,56,Yi Li,\"Yi Li, Zhanghui Kuang, Yimin Chen, Wayne Zhang\"\r\n706,GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud,https://github.com/ericyi/GSPN,,,,,,,Tuesday,Poster 1.2,184,Li Yi,\r\n252,NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction,https://github.com/ethanygao/NDDR-CNN,,,,,,,Tuesday,Poster 1.2,73,Yuan Gao,\"Yuan Gao, Jiayi Ma, Mingbo Zhao, Wei Liu, Alan L. Yuille\"\r\n232,DLOW: Domain Flow for Adaptation and Generalization,https://github.com/ETHRuiGong/DLOW,,,,Tuesday,Oral 1.2B,15:05,Tuesday,Poster 1.2,134,Rui Gong,\"Rui Gong, Wen Li, Yuhua Chen, Luc Van Gool\"\r\n145,From Coarse to Fine: Robust Hierarchical Localization at Large Scale,https://github.com/ethz-asl/hf_net,,,,,,,Thursday,Poster 3.2,213,Paul-Edouard Sarlin,\"Paul-Edouard Sarlin, Cesar Cadena, Roland Siegwart, Marcin Dymczyk\"\r\n269,A Local Block Coordinate Descent Algorithm for the CSC Model,https://github.com/EvZissel/LoBCoD,,,,,,,Wednesday,Poster 2.2,167,Ev Zisselman,\"Ev Zisselman, Jeremias Sulam, Michael Elad\"\r\n137,Learning Monocular Depth Estimation Infusing Traditional Stereo Knowledge,https://github.com/fabiotosi92/monoResMatch-Tensorflow,,,,,,,Thursday,Poster 3.1,117,Fabio Tosi,\"Fabio Tosi, Filippo Aleotti, Matteo Poggi, Stefano Mattoccia\"\r\n350,Practical Full Resolution Learned Lossless Image Compression,https://github.com/fab-jul/L3C-PyTorch,,,,Thursday,Oral 3.2A,13:30,Thursday,Poster 3.2,1,Fabian Mentzer,\"Fabian Mentzer, Eirikur Agustsson, Michael Tschannen, Radu Timofte, Luc Van Gool\"\r\n1291,Attentive Single-Tasking of Multiple 
Tasks,https://github.com/facebookresearch/astmt,,,,,,,Tuesday,Poster 1.1,187,Kevis-Kokitsi Maninis,\r\n233,2.5D Visual Sound,https://github.com/facebookresearch/FAIR-Play,,http://vision.cs.utexas.edu/projects/2.5D_visual_sound/,,Tuesday,Oral 1.1C,9:46,Tuesday,Poster 1.1,117,Ruohan Gao,\"Ruohan Gao, Kristen Grauman\"\r\n172,Grounded Video Description,https://github.com/facebookresearch/grounded-video-description,,,,Wednesday,Oral 2.2B,13:30,Wednesday,Poster 2.2,177,Luowei Zhou,\"Luowei Zhou, Yannis Kalantidis, Xinlei Chen, Jason J. Corso, Marcus Rohrbach\"\r\n30,Feature Denoising for Improving Adversarial Robustness,https://github.com/facebookresearch/ImageNet-Adversarial-Training,,,,,,,Tuesday,Poster 1.1,27,Cihang Xie,\"Cihang Xie, Yuxin Wu, Laurens van der Maaten, Alan Yuille, Kaiming He\"\r\n351,Inverse Cooking: Recipe Generation From Food Images,https://github.com/facebookresearch/inversecooking,,,,,,,Thursday,Poster 3.1,184,Amaia Salvador,\"Amaia Salvador, Michal Drozdzal, Xavier Giro-i-Nieto, Adriana Romero\"\r\n7,Towards VQA Models That Can Read,https://github.com/facebookresearch/pythia,,,,,,,Wednesday,Poster 2.2,196,Amanpreet Singh,\"Amanpreet Singh, Vivek Natarajan, Meet Shah, Yu Jiang, Xinlei Chen, Dhruv Batra, Devi Parikh, Marcus Rohrbach\"\r\n19,3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training,https://github.com/facebookresearch/VideoPose3D,,,,,,,Wednesday,Poster 2.2,103,Dario Pavllo,\"Dario Pavllo, Christoph Feichtenhofer, David Grangier, Michael Auli\"\r\n234,Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering,https://github.com/fanchenyou/HME-VideoQA,,,,,,,Tuesday,Poster 1.1,202,Chenyou Fan,\"Chenyou Fan, Xiaofan Zhang, Shu Zhang, Wensheng Wang, Chi Zhang, Heng Huang\"\r\n352,MVF-Net: Multi-View 3D Face Morphable Model Regression,https://github.com/Fanziapril/mvfnet,,,,,,,Tuesday,Poster 1.1,85,Fanzi Wu,\"Fanzi Wu, Linchao Bao, Yajing Chen, Yonggen Ling, Yibing Song, Songnan 
Li, King Ngi Ngan, Wei Liu\"\r\n283,Learning Not to Learn: Training Deep Neural Networks With Biased Data,https://github.com/feidfoe/learning-not-to-learn,,,,,,,Thursday,Poster 3.1,6,Byungju Kim,\"Byungju Kim, Hyunwoo Kim, Kyungsu Kim, Sungjin Kim, Junmo Kim\"\r\n134,GA-Net: Guided Aggregation Net for End-To-End Stereo Matching,https://github.com/feihuzhang/GANet,,,,Tuesday,Oral 1.1B,9:36,Tuesday,Poster 1.1,78,Feihu Zhang,\"Feihu Zhang, Victor Prisacariu, Ruigang Yang, Philip H.S. Torr\"\r\n526,Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry,https://github.com/feixue94/BeyondTrcking,,,,Thursday,Oral 3.1A,8:58,Thursday,Poster 3.1,195,Fei Xue,\r\n353,Unsupervised Image Captioning,https://github.com/fengyang0317/unsupervised_captioning?,,,,,,,Tuesday,Poster 1.2,202,Yang Feng,\"Yang Feng, Lin Ma, Wei Liu, Jiebo Luo\"\r\n21,Fast Interactive Object Annotation With Curve-GCN,https://github.com/fidler-lab/curve-gcn,,,,,,,Wednesday,Poster 2.1,71,Huan Ling,\"Huan Ling, Jun Gao, Amlan Kar, Wenzheng Chen, Sanja Fidler?\"\r\n1105,Object-Centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video,https://github.com/fjchange/object_centric_VAD,,,,,,,Wednesday,Poster 2.2,112,Radu Tudor Ionescu,\r\n817,Frame-Consistent Recurrent Video Deraining With Dual-Level Flow,https://github.com/flyywh/Dual-FLow-Video-Deraining-CVPR-2019,,,,,,,Tuesday,Poster 1.1,168,Wenhan Yang,\r\n216,Domain Generalization by Solving Jigsaw Puzzles,https://github.com/fmcarlucci/JigenDG,,,,Tuesday,Oral 1.2A,14:24,Tuesday,Poster 1.2,27,Fabio M. 
Carlucci,\"Fabio Maria Carlucci, Antonio D'Innocente, Silvia Bucci, Barbara Caputo, Tatiana Tommasi \"\r\n960,ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification,https://github.com/fnzhan/ESIR,,,,,,,Tuesday,Poster 1.1,208,Fangneng Zhan,\r\n354,Spatial Fusion GAN for Image Synthesis,https://github.com/fnzhan/SF-GAN,,,,,,,Tuesday,Poster 1.2,136,Fangneng Zhan,\r\n355,PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation,https://github.com/FoggYu/PartNet,,,,,,,Thursday,Poster 3.1,71,Fenggen Yu,\"Fenggen Yu, Kun Liu, Yan Zhang, Chenyang Zhu, Kai Xu\"\r\n9,Fast Online Object Tracking and Segmentation: A Unifying Approach,https://github.com/foolwood/SiamMask,,http://www.robots.ox.ac.uk/~qwang/SiamMask/,,,,,Tuesday,Poster 1.1,134,Qiang Wang,\"Qiang Wang,?Li Zhang,?Luca Bertinetto,?Weiming Hu,?Philip H.S. Torr\"\r\n570,Revealing Scenes by Inverting Structure From Motion Reconstructions,https://github.com/francescopittaluga/invsfm,,,,Tuesday,Oral 1.1B,9:10,Tuesday,Poster 1.1,74,Francesco Pittaluga,\r\n243,\"Good News, Everyone! 
Context Driven Entity-Aware Captioning for News Images\",https://github.com/furkanbiten/GoodNews,,,,,,,Thursday,Poster 3.2,188,Ali Furkan Biten,\"Ali Furkan Biten, Lluis Gomez, Marçal Rusiñol, Dimosthenis Karatzas\"\r\n356,Deep Geometric Prior for Surface Reconstruction,https://github.com/fwilliams/deep-geometric-prior,,,,,,,Thursday,Poster 3.1,151,Francis Williams,\"Francis Williams, Teseo Schneider, Claudio Silva, Denis Zorin, Joan Bruna, Daniele Panozzo\"\r\n240,Kervolutional Neural Networks,https://github.com/gan3sh500/kervolution-pytorch,,,,Tuesday,Oral 1.1A,9:18,Tuesday,Poster 1.1,4,Chen Wang,\"Chen Wang, Jianfei Yang, Lihua Xie, Junsong Yuan\"\r\n357,Graphonomy: Universal Human Parsing via Graph Transfer Learning,https://github.com/Gaoyiminggithub/Graphonomy,,,,,,,Wednesday,Poster 2.2,72,Ke Gong,\"Ke Gong, Yiming Gao, Xiaodan Liang, Xiaohui Shen, Meng Wang, Liang Lin\"\r\n358,Pedestrian Detection With Autoregressive Network Phases,https://github.com/garrickbrazil/AR-Ped,,,,,,,Wednesday,Poster 2.2,49,Garrick Brazil,\"Garrick Brazil, Xiaoming Liu\"\r\n359,Semantic Graph Convolutional Networks for 3D Human Pose Regression,https://github.com/garyzhao/SemGCN,,,,,,,Tuesday,Poster 1.2,95,Long Zhao,\r\n270,Instance-Level Meta Normalization,https://github.com/Gasoonjia/ILM-Norm,,,,,,,Wednesday,Poster 2.1,31,Songhao Jia,\"Songhao Jia, Ding-Jie Chen, Hwann-Tzong Chen\"\r\n360,3D Hand Shape and Pose Estimation From a Single RGB Image,https://github.com/geliuhao/3DHandShapePosefromRGB,,https://sites.google.com/site/geliuhaontu/home/cvpr2019,,Thursday,Oral 3.2B,13:48,Thursday,Poster 3.2,95,Liuhao Ge,\"Liuhao Ge, Zhou Ren, Yuncheng Li, Zehao Xue, Yingying Wang, Jianfei Cai, Junsong Yuan\"\r\n361,Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression,https://github.com/generalized-iou/Detectron.pytorch,https://mp.weixin.qq.com/s/6QsyYtEVjavoLfU_lQF1pw,,,,,,Tuesday,Poster 1.1,43,Hamid Rezatofighi,\"Hamid Rezatofighi, Nathan Tsoi, JunYoung Gwak, 
Amir Sadeghian, Ian Reid, Silvio Savarese\"\r\n362,Hierarchical Deep Stereo Matching on High-Resolution Images,https://github.com/gengshan-y/high-res-stereo,,http://www.contrib.andrew.cmu.edu/~gengshay/cvpr19stereo,,,,,Wednesday,Poster 2.1,97,Gengshan Yang,Gengshan YangJoshua Manela Michael HappoldDeva Ramanan\r\n164,Generating Classification Weights With GNN Denoising Autoencoders for Few-Shot Learning,https://github.com/gidariss/wDAE_GNN_FewShot,,,,Tuesday,Oral 1.1A,9:10,Tuesday,Poster 1.1,3,Spyros Gidaris,\"Spyros Gidaris, Nikos Komodakis\"\r\n363,Learning From Synthetic Data for Crowd Counting in the Wild,https://github.com/gjy3035/GCC-CL,,,,,,,Wednesday,Poster 2.2,166,Qi Wang,\"Qi Wang, Junyu Gao, Wei Lin, Yuan Yuan\"\r\n364,Context and Attribute Grounded Dense Captioning,https://github.com/gjyin91/CAG-Net,https://zhuanlan.zhihu.com/p/57822460,https://gjyin91.github.io/projects/cagnet.html,,,,,Wednesday,Poster 2.1,201,Guojun Yin,\"Guojun Yin, Lu Sheng, Bin Liu, Nenghai Yu, Xiaogang Wang, Jing Shao\"\r\n13,Self-Supervised GANs via Auxiliary Rotation Loss,https://github.com/google/compare_gan,,,,,,,Thursday,Poster 3.2,138,Ting Chen,\"Ting Chen, Xiaohua Zhai, Marvin Ritter, Mario Lucic, Neil Houlsby\"\r\n365,Neural Rerendering in the Wild,https://github.com/google/neural_rerendering_in_the_wild,,,,Wednesday,Oral 2.2C,14:42,Wednesday,Poster 2.2,142,Moustafa Meshry,\"Moustafa Meshry, Dan B Goldman, Sameh Khamis, Hugues Hoppe, Rohit Pandey, Noah Snavely, Ricardo Martin-Brualla\"\r\n146,Peeking Into the Future: Predicting Future Person Activities and Locations in Videos,https://github.com/google/next-prediction,,,,,,,Wednesday,Poster 2.1,133,Junwei Liang,\"Junwei Liang, Lu Jiang, Juan Carlos Niebles, Alexander Hauptmann, Li Fei-Fei \"\r\n79,Revisiting Self-Supervised Visual Representation Learning,https://github.com/google/revisiting-self-supervised,,,,,,,Tuesday,Poster 1.1,194,Alexander Kolesnikov,\"Alexander Kolesnikov,?Xiaohua Zhai,?Lucas 
Beyer\"\r\n195,BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames,https://github.com/griffbr/BubbleNets,,,,Thursday,Oral 3.1C,9:29,Thursday,Poster 3.1,65,Brent A. Griffin,\"Brent A. Griffin, Jason J. Corso\"\r\n366,Attentive Region Embedding Network for Zero-Shot Learning,https://github.com/gsx0/Attentive-Region-Embedding-Network-for-Zero-shot-Learning,,,,,,,Thursday,Poster 3.1,45,Guo-Sen Xie,\r\n367,Self-Calibrating Deep Photometric Stereo Networks,https://github.com/guanyingc/SDPS-Net,,http://gychen.org/SDPS-Net/,,Thursday,Oral 3.1B,9:11,Thursday,Poster 3.1,84,Guanying Chen,\"Guanying Chen, Kai Han, Boxin Shi, Yasuyuki Matsushita, Kwan-Yee K. Wong\"\r\n208,Object Counting and Instance Segmentation With Image-Level Supervision,https://github.com/GuoleiSun/CountSeg,,,,,,,Thursday,Poster 3.2,181,Hisham Cholakkal,\"Hisham Cholakkal, Guolei Sun (equal contribution), Fahad Shahbaz Khan, Ling Shao\"\r\n110,DVC: An End-To-End Deep Video Compression Framework,https://github.com/GuoLusjtu/DVC,,,,Thursday,Oral 3.2C,13:40,Thursday,Poster 3.2,149,Guo Lu,\"Guo Lu, Wanli Ouyang, Dong Xu, Xiaoyun Zhang, Chunlei Cai, Zhiyong Gao \"\r\n54,Toward Convolutional Blind Denoising of Real Photographs,https://github.com/GuoShi28/CBDNet,,,,,,,Tuesday,Poster 1.1,173,Shi Guo,\"Shi Guo, Zifei Yan, Kai Zhang, Wangmeng Zuo, Lei Zhang\"\r\n103,Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation,https://github.com/Ha0Tang/SelectionGAN,,,,Tuesday,Oral 1.2B,14:29,Tuesday,Poster 1.2,128,Hao Tang,\"Hao Tang, Dan Xu, Nicu Sebe, Yanzhi Wang, Jason J. 
Corso, Yan Yan\"\r\n987,AdaptiveFace: Adaptive Margin and Sampling for Face Recognition,https://github.com/haoliu1994/AdaptiveFace,,,,,,,Thursday,Poster 3.2,117,Hao Liu,\r\n368,Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding,https://github.com/hassanhub/MultiGrounding,,,,,,,Thursday,Poster 3.2,189,Hassan Akbari,\"Hassan Akbari, Svebor Karaman, Surabhi Bhargava, Brian Chen, Carl Vondrick, Shih-Fu Chang\"\r\n369,Learning Joint Reconstruction of Hands and Manipulated Objects,https://github.com/hassony2/manopth,,https://www.di.ens.fr/willow/research/obman/,,,,,Thursday,Poster 3.2,85,Yana Hasson,\"Yana Hasson, Gül Varol, Dimitrios Tzionas, Igor Kalevatykh, Michael J. Black, Ivan Laptev, Cordelia Schmid\"\r\n253,The Pros and Cons: Rank-Aware Temporal Attention for Skill Determination in Long Videos,https://github.com/hazeld/rank-aware-attention-network,,,,,,,Wednesday,Poster 2.2,114,Hazel Doughty,\"Hazel Doughty, Walterio Mayol-Cuevas, Dima Damen\"\r\n370,Modulating Image Restoration With Continual Levels via Adaptive Feature Modification Layers,https://github.com/hejingwenhejingwen/AdaFM,,,,Thursday,Oral 3.2C,14:11,Thursday,Poster 3.2,154,Jingwen He,\"Jingwen He, Chao Dong, Yu Qiao\"\r\n50,Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis,https://github.com/HelenMao/MSGAN,,,,,,,Tuesday,Poster 1.1,144,Qi Mao,\"Qi Mao, Hsin-Ying Lee, Hung-Yu Tseng, Siwei Ma, Ming-Hsuan Yang\"\r\n371,Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition,https://github.com/Heliang-Zheng/TASN,,,,,,,Wednesday,Poster 2.1,46,Heliang Zheng,\r\n159,LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking,https://github.com/HengLan/LaSOT_Evaluation_Toolkit,,https://cis.temple.edu/lasot/,,,,,Wednesday,Poster 2.1,83,Heng Fan,\"Heng Fan, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Hexin Bai, Yong Xu, Chunyuan Liao, Haibin Ling\"\r\n111,Filter Pruning via Geometric 
Median for Deep Convolutional Neural Networks Acceleration,https://github.com/he-y/filter-pruning-geometric-median,,,,Wednesday,Oral 2.1A,9:11,Wednesday,Poster 2.1,8,Yang He,\"Yang He, Ping Liu, Ziwei Wang, Zhilan Hu, Yi Yang\"\r\n156,TextureNet: Consistent Local Parametrizations for Learning From High-Resolution Signals on Meshes,https://github.com/hjwdzh/TextureNet,,,,Wednesday,Oral 2.1B,8:40,Wednesday,Poster 2.1,108,Jingwei Huang,\"Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, and Leonidas Guibas\"\r\n36,Stereo R-CNN Based 3D Object Detection for Autonomous Driving,https://github.com/HKUST-Aerial-Robotics/Stereo-RCNN,https://mp.weixin.qq.com/s/X6XI7L57fGGGkAUug3E6Dg,,,,,,Wednesday,Poster 2.2,92,Peiliang Li,\"Peiliang Li, Xiaozhi Chen, Shaojie Shen\"\r\n93,PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet,https://github.com/hmgoforth/PointNetLK,,,,,,,Wednesday,Poster 2.2,24,Yasuhiro Aoki,\"Yasuhiro Aoki, Hunter Goforth, Rangaprasad Arun Srivatsan, Simon Lucey\"\r\n373,K-Nearest Neighbors Hashing,https://github.com/HolmesShuan/K-Nearest-Neighbors-Hashing,,,,,,,Tuesday,Poster 1.2,36,Xiangyu He,Xiangyu He and Peisong Wang and Jian Cheng\r\n374,ODE-Inspired Network Design for Single Image Super-Resolution,https://github.com/HolmesShuan/OISR-PyTorch,,,,,,,Tuesday,Poster 1.1,175,Xiangyu He,Xiangyu He and Zitao Mo and Peisong Wang and Yang Liu and Mingyuan Yang and Jian Cheng\r\n917,Deep Stacked Hierarchical Multi-Patch Network for Image Deblurring,https://github.com/HongguangZhang/DMPHN-cvpr19-master,,,,,,,Wednesday,Poster 2.1,174,Hongguang Zhang,\r\n375,Neural Scene Decomposition for Multi-Person Motion Capture,https://github.com/hrhodin/NeuralSceneDecomposition,,,,,,,Wednesday,Poster 2.2,98,Helge Rhodin,\r\n920,FOCNet: A Fractional Optimal Control Network for Image Denoising,https://github.com/hsijiaxidian/FOCNet,,,,,,,Wednesday,Poster 2.1,182,Xixi Jia,\r\n376,PointWeb: Enhancing Local Neighborhood Features for Point 
Cloud Processing,https://github.com/hszhao/PointWeb,,,,,,,Wednesday,Poster 2.1,102,Hengshuang Zhao,\"Hengshuang Zhao*, Li Jiang*, Chi-Wing Fu, and Jiaya Jia\"\r\n222,Iterative Normalization: Beyond Standardization Towards Efficient Whitening,https://github.com/huangleiBuaa/IterNorm,,,,,,,Wednesday,Poster 2.1,32,Lei Huang,\"Lei Huang, Yi Zhou, Fan Zhu, Li Liu, Ling Shao \"\r\n98,Noise-Tolerant Paradigm for Training Face Recognition CNNs,https://github.com/huangyangyu/NoiseFace,,,,,,,Thursday,Poster 3.2,111,Wei Hu,\"Wei Hu, Yangyu Huang, Fan Zhang, Ruirui Li\"\r\n377,Deep Surface Normal Estimation With Hierarchical RGB-D Fusion,https://github.com/huangyunmu/HFMNet_CVPR2019,,,,,,,Wednesday,Poster 2.1,192,Jin Zeng,\r\n378,Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation,https://github.com/hughw19/NOCS_CVPR2019,,,,Tuesday,Oral 1.2C,15:00,Tuesday,Poster 1.2,181,He Wang,\r\n135,Reliable and Efficient Image Cropping: A Grid Anchor Based Approach,https://github.com/HuiZeng/Grid-Anchor-based-Image-Cropping,,,,,,,Wednesday,Poster 2.1,171,Hui Zeng,\"Hui Zeng, Lida Li, Zisheng Cao, Lei Zhang\"\r\n244,Towards Visual Feature Translation,https://github.com/hujiecpp/VisualFeatureTranslation,,,,,,,Tuesday,Poster 1.2,53,Jie Hu,\"Jie Hu, Rongrong Ji, Hong Liu, Shengchuan Zhang, Cheng Deng, Qi Tian\"\r\n279,Efficient Neural Network Compression,https://github.com/Hyeji-Kim/ENC,,,,,,,Thursday,Poster 3.2,198,Hyeji Kim,\"Hyeji Kim, Muhammad Umar Karim Khan, Chong-Min Kyung\"\r\n165,Transfer Learning via Unsupervised Task Discovery for Visual Question Answering,https://github.com/HyeonwooNoh/vqa_task_discovery,,,,,,,Wednesday,Poster 2.2,203,Hyeonwoo Noh,\"Hyeonwoo Noh, Taehoon Kim, Jonghwan Mun, Bohyung Han\"\r\n865,CANet: Class-Agnostic Segmentation Networks With Iterative Refinement and Attentive Few-Shot Learning,https://github.com/icoz69/CaNet,,,,,,,Wednesday,Poster 2.1,67,Chi Zhang,\r\n254,Factor Graph 
Attention,https://github.com/idansc/fga,,,,,,,Tuesday,Poster 1.1,206,Idan Schwartz,\"Idan Schwartz, Seunghak Yu, Tamir Hazan, Alexander Schwing\"\r\n263,A Simple Baseline for Audio-Visual Scene-Aware Dialog,https://github.com/idansc/simple-avsd,,,,,,,Thursday,Poster 3.2,196,Idan Schwartz,\"Idan Schwartz, Alexander Schwing, Tamir Hazan\"\r\n1294,Deep Incremental Hashing Network for Efficient Image Retrieval,https://github.com/IIE-MR/DIHN,,,,,,,Thursday,Poster 3.1,12,Dayan Wu,\r\n85,RVOS: End-To-End Recurrent Network for Video Object Segmentation,https://github.com/imatge-upc/rvos,,,,,,,Wednesday,Poster 2.1,73,Carles Ventura,\"Carles Ventura, Miriam Bellver, Andreu Girbau, Amaia Salvador, Ferran Marques, Xavier Giro-i-Nieto\"\r\n25,Selective Kernel Networks,https://github.com/implus/SKNet,,,,,,,Tuesday,Poster 1.1,28,Xiang Li,\"Xiang Li, Wenhai Wang, Xiaolin Hu, Jian Yang\"\r\n264,SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration Without Correspondences,https://github.com/intellhave/SDRSAC,,,,Tuesday,Oral 1.1B,9:00,Tuesday,Poster 1.1,72,Huu M. Le,\"Huu Le, Thanh-Toan Do, Tuan Hoang, Ngai-Man Cheung\"\r\n594,Structured Knowledge Distillation for Semantic Segmentation,https://github.com/irfanICMLL/structure_knowledge_distillation,https://mp.weixin.qq.com/s/WvYPf7xHu4TtuDan_Hh_pQ,,,Tuesday,Oral 1.2C,14:34,Tuesday,Poster 1.2,177,Yifan Liu,\"Yifan Liu, Ke Chen, Chris Liu, Zengchang Qin, Zhenbo Luo, Jingdong Wang \"\r\n235,HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs,https://github.com/irvinxav/Efficient-HetConv-Heterogeneous-Kernel-Based-Convolutions,,,,,,,Wednesday,Poster 2.1,28,Pravendra Singh,\"Pravendra Singh, Vinay Kumar Verma, Piyush Rai, Vinay P. 
Namboodiri \"\r\n379,RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion,https://github.com/iSarmad/RL-GAN-Net,,,,,,,Wednesday,Poster 2.1,166,Muhammad Sarmad,\"Muhammad Sarmad, Hyunjoo Jenny Lee, Young Min Kim\"\r\n380,ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features,https://github.com/ISICV/ManTraNet,,,,,,,Thursday,Poster 3.1,76,Yue Wu,\r\n271,AIRD: Adversarial Learning Framework for Image Repurposing Detection,https://github.com/isi-vista/AIRD-Datasets,,,,,,,Thursday,Poster 3.2,36,Ayush Jaiswal,\"Ayush Jaiswal, Yue Wu, Wael AbdAlmageed, Iacopo Masi, Premkumar Natarajan\"\r\n31,DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion,https://github.com/j96w/DenseFusion,https://mp.weixin.qq.com/s/wrND2cocWlPPVXPqpq-Glg,,,,,,Tuesday,Poster 1.2,87,Chen Wang,\"Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martín-Martín, Cewu Lu, Li Fei-Fei, Silvio Savarese\"\r\n295,DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation,https://github.com/j-a-lin/DFANet_PyTorch,,,,,,,Thursday,Poster 3.1,74,Hanchao Li,\"Hanchao Li, Pengfei Xiong, Haoqiang Fan, Jian Sun\"\r\n381,Adversarial Inference for Multi-Sentence Video Description,https://github.com/jamespark3922/adv-inf,,,,Wednesday,Oral 2.2B,13:40,Wednesday,Poster 2.2,179,Jae Sung Park,\"Jae Sung Park, Marcus Rohrbach, Trevor Darrell, Anna Rohrbach\"\r\n201,Locating Objects Without Bounding Boxes,https://github.com/javiribera/locating-objects-without-bboxes,,,,Wednesday,Oral 2.2A,14:16,Wednesday,Poster 2.2,33,Javier Ribera,\"Javier Ribera, David Gera, Yuhao Chen, Edward J. 
Delp\"\r\n173,Destruction and Construction Learning for Fine-Grained Image Recognition,https://github.com/JDAI-CV/DCL,,,,,,,Wednesday,Poster 2.1,61,Yue Chen,\r\n83,CrowdPose: Efficient Crowded Scenes Pose Estimation and a New Benchmark,https://github.com/Jeff-sjtu/CrowdPose,,,,Thursday,Oral 3.2B,14:06,Thursday,Poster 3.2,98,Jiefeng Li,\"Jiefeng Li, Can Wang, Hao Zhu, Yihuan Mao, Hao-Shu Fang, Cewu Lu\"\r\n121,Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses,https://github.com/jeromerony/fast_adversarial,,,,Wednesday,Oral 2.1A,8:58,Wednesday,Poster 2.1,6,Jerome Rony,\"Jérôme Rony, Luiz G. Hafemann, Luiz S. Oliveira, Ismail Ben Ayed, Robert Sabourin, Eric Granger\"\r\n196,REPAIR: Removing Representation Bias by Dataset Resampling,https://github.com/JerryYLi/Dataset-REPAIR,,,,,,,Thursday,Poster 3.1,94,Yi Li,\"Yi Li, Nuno Vasconcelos\"\r\n147,Attention-Aware Multi-Stroke Style Transfer,https://github.com/JianqiangRen/AAMS,,https://sites.google.com/view/yuanyao/attention-aware-multi-stroke-style-transfer,,,,,Tuesday,Poster 1.1,148,Yuan Yao,\"Yuan Yao, Jianqiang Ren, Xuansong Xie, Weidong Liu, Yong-Jin Liu, Jun Wang\"\r\n245,Dance With Flow: Two-In-One Stream Action Detection,https://github.com/jiaozizhao/Two-in-One-ActionDetection,,,,,,,Thursday,Poster 3.1,131,Jiaojiao Zhao,\"Jiaojiao Zhao, Cees G.M. 
Snoek\"\r\n382,ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples,https://github.com/jiaxiaojunQAQ/Comdefend,https://zhuanlan.zhihu.com/p/62232536,,,,,,Wednesday,Poster 2.1,185,Xiaojun Jia,\"Xiaojun Jia, Xingxing Wei, Xiaochun Cao, Hassan Foroosh\"\r\n696,Attention-Guided Network for Ghost-Free High Dynamic Range Imaging,https://github.com/JimmyChame/The-State-of-the-Art-in-HDR-Deghosting,,,,,,,Tuesday,Poster 1.1,177,Qingsen Yan,\r\n383,Weakly Supervised Learning of Instance Segmentation With Inter-Pixel Relations,https://github.com/jiwoon-ahn/irn,,,,Tuesday,Oral 1.2A,14:11,Tuesday,Poster 1.2,25,Jiwoon Ahn,\"Jiwoon Ahn, Sunghyun Cho, Suha Kwak\"\r\n1086,CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth,https://github.com/jmfacil/camconvs,,http://webdiis.unizar.es/~jmfacil/camconvs/,,,,,Thursday,Poster 3.2,87,Jose M. Facil,\"Jose M. Facil, Benjamin Ummenhofer, Huizhong Zhou, Luis Montesano, Thomas Brox, Javier Civera\"\r\n246,A Generative Appearance Model for End-To-End Video Object Segmentation,https://github.com/joakimjohnander/agame-vos,,,,Thursday,Oral 3.1C,9:52,Thursday,Poster 3.1,69,Joakim Johnander,\"Joakim Johnander, Martin Danelljan, Emil Brissman, Fahad Shahbaz Khan, Michael Felsberg\"\r\n384,Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization,https://github.com/joe-siyuan-qiao/NeuralRejuvenation-CVPR19,,,,Tuesday,Oral 1.1A,9:36,Tuesday,Poster 1.1,7,Siyuan Qiao,\"Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille\"\r\n385,FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network,https://github.com/jonahthelion/FastDraw,,,,,,,Thursday,Poster 3.2,62,Jonah Philion,Jonah Philion\r\n386,A General and Adaptive Robust Loss Function,https://github.com/jonbarron/robust_loss_pytorch,,,,Wednesday,Oral 2.1A,9:06,Wednesday,Poster 2.1,7,Jonathan T. Barron,Jonathan T. 
Barron \r\n387,A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision,https://github.com/JosephineRabbit/MLMSNet,,,,,,,Wednesday,Poster 2.2,161,Runmin Wu,\"Runmin Wu, Mengyang Feng, Wenlong Guan, Dong Wang, Huchuan Lu, Errui Ding\"\r\n217,Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation,https://github.com/jspenmar/SAND_features,,,,,,,Wednesday,Poster 2.1,197,Jaime Spencer,\"Jaime Spencer, Richard Bowden, Simon Hadfield\"\r\n166,Noise2Void - Learning Denoising From Single Noisy Images,https://github.com/juglab/n2v,,,,,,,Tuesday,Poster 1.1,215,Alexander Krull,\"Alexander Krull, Tim-Oliver Buchholz, Florian Jug\"\r\n247,Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization,https://github.com/junbaoZHUO/UODTN,,,,,,,Tuesday,Poster 1.1,52,Junbao Zhuo,\"Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang\"\r\n16,Dual Attention Network for Scene Segmentation,https://github.com/junfu1115/DANet,,,,,,,Tuesday,Poster 1.2,67,Jun Fu,\"Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, Hanqing Lu\"\r\n90,FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search,https://github.com/JunrQ/NAS,,,,Thursday,Oral 3.2A,14:34,Thursday,Poster 3.2,12,Bichen Wu,\"Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer\"\r\n388,Attention-Based Dropout Layer for Weakly Supervised Object Localization,https://github.com/junsukchoe/ADL,,,,Tuesday,Oral 1.2A,14:16,Tuesday,Poster 1.2,26,Junsuk Choe,\r\n108,Video Generation From Single Semantic Label Map,https://github.com/junting/seg2vid,,,,,,,Tuesday,Poster 1.2,144,Junting Pan,\"Junting Pan, Chengyu Wang, Xu Jia, Jing Shao, Lu Sheng, Junjie Yan, Xiaogang Wang\"\r\n389,Natural and Realistic Single Image Super-Resolution With Explicit Natural Manifold Discrimination,https://github.com/JWSoh/NatSR,,,,,,,Wednesday,Poster 2.2,158,Jae Woong 
Soh,\r\n390,Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection,https://github.com/JXingZhao/ContrastPrior,,,,,,,Tuesday,Poster 1.2,164,Jia-Xing Zhao,\"Zhao, Jia-Xing and Cao, Yang and Fan, Deng-Ping and Cheng, Ming-Ming and Li, Xuan-Yi and Zhang, Le\"\r\n391,Graph Convolutional Label Noise Cleaner: Train a Plug-And-Play Action Classifier for Anomaly Detection,https://github.com/jx-zhong-for-academic-purpose/GCN-Anomaly-Detection,,,,,,,Tuesday,Poster 1.1,125,Jia-Xing Zhong,\r\n392,Learning to Compose Dynamic Tree Structures for Visual Contexts,https://github.com/KaihuaTang/VCTree-Visual-Question-Answering,,,,Wednesday,Oral 2.2B,13:53,Wednesday,Poster 2.2,181,Kaihua Tang,\"Kaihua Tang, Hanwang Zhang, Baoyuan Wu, Wenhan Luo, Wei Liu\"\r\n393,A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations,https://github.com/kakumarabhishek/KernelizedManifoldMapping-1,,,,,,,Thursday,Poster 3.2,37,Saeid Asgari Taghanaki,\"Saeid Asgari Taghanaki, Kumar Abhishek, Shekoofeh Azizi, Ghassan Hamarneh\"\r\n296,QATM: Quality-Aware Template Matching for Deep Learning,https://github.com/kamata1729/QATM_pytorch,,,,,,,Thursday,Poster 3.2,59,Jiaxin Cheng,\"Jiaxin Cheng, Yue Wu, Wael Abd-Almageed, Premkumar Natarajan \"\r\n284,Complete the Look: Scene-Based Complementary Product Recommendation,https://github.com/kang205/STL-Dataset,,,,,,,Thursday,Poster 3.1,207,Wang-Cheng Kang,\"Wang-Cheng Kang, Eric Kim, Jure Leskovec, Charles Rosenberg, Julian McAuley\"\r\n394,Semantic Projection Network for Zero- and Few-Label Semantic Segmentation,https://github.com/kazuto1011/deeplab-pytorch,,,,,,,Wednesday,Poster 2.2,172,Yongqin Xian,\r\n180,Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation,https://github.com/Kelym/FAST,,,,Wednesday,Oral 2.2B,15:05,Wednesday,Poster 2.2,193,Liyiming Ke,\"Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha 
Srinivasa\"\r\n236,Edge-Labeling Graph Neural Network for Few-Shot Learning,https://github.com/khy0809/fewshot-egnn,,,,Tuesday,Oral 1.1A,9:05,Tuesday,Poster 1.1,2,Jongmin Kim,\"Jongmin Kim, Taesup Kim, Sungwoong Kim, Chang D. Yoo\"\r\n63,ScratchDet: Exploring to Train Single-Shot Object Detectors from Scratch,https://github.com/KimSoybean/ScratchDet,https://mp.weixin.qq.com/s/TZj0QzDXE6QbCY5-pT6RNQ,,,Tuesday,Oral 1.2A,14:47,Tuesday,Poster 1.2,31,Rui Zhu,\"Rui Zhu, Shifeng Zhang, Xiaobo Wang, Longyin Wen, Hailin Shi, Liefeng Bo, Tao Mei\"\r\n75,Meta-Learning With Differentiable Convex Optimization,https://github.com/kjunelee/MetaOptNet,,,,Thursday,Oral 3.2A,13:48,Thursday,Poster 3.2,4,Kwonjoon Lee,\"Kwonjoon Lee, Subhransu Maji, Avinash Ravichandran, Stefano Soatto\"\r\n112,FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery,https://github.com/kkanshul/finegan,,,,Wednesday,Oral 2.2A,14:24,Wednesday,Poster 2.2,34,Krishna Kumar Singh,\"Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee\"\r\n70,Unsupervised Person Re-Identification by Soft Multilabel Learning,https://github.com/KovenYu/MAR,,,,Tuesday,Oral 1.2A,13:35,Tuesday,Poster 1.2,19,Hong-Xing Yu,\"Hong-Xing Yu, Wei-Shi Zheng, Ancong Wu, Xiaowei Guo, Shaogang Gong, Jian-Huang Lai\"\r\n272,Representation Similarity Analysis for Efficient Task Taxonomy & Transfer Learning,https://github.com/kshitijd20/RSA-CVPR19-release,,,,,,,Thursday,Poster 3.2,180,Kshitij Dwivedi,\"Kshitij Dwivedi, Gemma Roig\"\r\n395,DeepCO3: Deep Instance Co-Segmentation by Co-Peak Search and Co-Saliency Detection,https://github.com/KuangJuiHsu/DeepCO3?,,,,Thursday,Oral 3.1C,8:48,Thursday,Poster 3.1,58,Kuang-Jui Hsu,\"Kuang-Jui Hsu, Yen-Yu Lin, Yung-Yu Chuang\"\r\n396,Deep Metric Learning to Rank,https://github.com/kunhe/FastAP-metric-learning,,,,,,,Tuesday,Poster 1.1,188,Fatih Cakir,\"Fatih Cakir*, Kun He*, Xide Xia, Brian Kulis, and Stan Sclaroff\"\r\n297,Additive Adversarial Learning for Unbiased 
Authentication,https://github.com/langlrsw/AAL-unbiased-authentication,,,,,,,Thursday,Poster 3.2,46,Jian Liang,\"Jian Liang, Yuren Cao, Chenbin Zhang, Shiyu Chang, Kun Bai, Zenglin Xu\"\r\n1098,HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds,https://github.com/laoreja/HPLFlowNet,,,,,,,Tuesday,Poster 1.2,78,Xiuye Gu,\r\n248,Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images,https://github.com/LARC-CMU-SMU/ACME,,,,,,,Thursday,Poster 3.2,61,Hao Wang,\"Hao Wang, Doyen Sahoo, Chenghao Liu, Ee-peng Lim, Steven C. H. Hoi\"\r\n820,3DN: 3D Deformation Network,https://github.com/laughtervv/3DN,,,,,,,Tuesday,Poster 1.1,93,Weiyue Wang,\r\n223,Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics,https://github.com/laura-wang/video_repres_mas,,,,,,,Tuesday,Poster 1.2,190,Jiangliu Wang,\"Jiangliu Wang, Jianbo Jiao, Linchao Bao, Shengfeng He, Yunhui Liu, Wei Liu\"\r\n846,Explore-Exploit Graph Traversal for Image Retrieval,https://github.com/layer6ai-labs/egt,,,,,,,Thursday,Poster 3.1,49,Cheng Chang,\r\n11,Joint Discriminative and Generative Learning for Person Re-Identification,https://github.com/layumi/Person_reID_baseline_pytorch,https://mp.weixin.qq.com/s/51X7NIS1UEXJ1sqx54tuwg,,,Tuesday,Oral 1.2A,13:30,Tuesday,Poster 1.2,18,Zhedong Zheng,\"Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, Jan Kautz\"\r\n298,\"Isospectralization, or How to Hear Shape, Style, and Correspondence\",https://github.com/lcosmo/isospectralization,,,,,,,Wednesday,Poster 2.2,80,Luca Cosmo,\"Luca Cosmo, Mikhail Panine, Arianna Rampini, Maks Ovsjanikov, Michael M. 
Bronstein, Emanuele Rodolà\"\r\n397,Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss,https://github.com/lelechen63/ATVGnet,,,,,,,Wednesday,Poster 2.2,111,Lele Chen,\"Lele Chen, Ross K Maddox, Zhiyao Duan, Chenliang Xu\"\r\n398,Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence,https://github.com/lelimite4444/BridgeDepthFlow ,,,,,,,Tuesday,Poster 1.1,191,Hsueh-Ying Lai,\"Hsueh-Ying Lai, Yi-Hsuan Tsai, Wei-Chen Chiu\"\r\n504,LaSO: Label-Set Operations Networks for Multi-Label Few-Shot Learning,https://github.com/leokarlin/LaSO,,,,Wednesday,Oral 2.2A,15:00,Wednesday,Poster 2.2,40,Amit Alfassy,\r\n127,\"Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on N-Spheres\",https://github.com/leoshine/Spherical_Regression,,,,,,,Thursday,Poster 3.1,113,Shuai Liao,\"Shuai Liao, Efstratios Gavves, Cees G. M. Snoek\"\r\n10,Deep High-Resolution Representation Learning for Human Pose Estimation,https://github.com/leoxiaobin/deep-high-resolution-net.pytorch,https://mp.weixin.qq.com/s/ZRCzBTBmlEzQCVo1HLWtbQ,,,,,,Wednesday,Poster 2.1,130,Ke Sun,\"Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang\"\r\n399,Multi-Target Embodied Question Answering,https://github.com/lichengunc/mteqa,,,,,,,Wednesday,Poster 2.1,208,Licheng Yu,\"Yu, Licheng and Chen, Xinlei and Gkioxari, Georgia and Bansal, Mohit and Berg, Tamara L and Batra, Dhruv\"\r\n202,Leveraging the Invariant Side of Generative Zero-Shot Learning,https://github.com/lijin118/LisGAN,,,,,,,Wednesday,Poster 2.2,67,Jingjing Li,\"Jingjing Li, Mengmeng Jin, Ke Lu, Zhengming Ding, Lei Zhu, Zi Huang\"\r\n400,Learning to Learn From Noisy Labeled Data,https://github.com/LiJunnan1992/MLNT,,,,,,,Wednesday,Poster 2.1,50,Junnan Li,\"Junnan Li, Y. Wong, Qi Zhao, M. 
Kankanhalli \"\r\n187,Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition,https://github.com/limaosen0/AS-GCN,,,,,,,Tuesday,Poster 1.2,112,Maosen Li,\"Maosen Li, Siheng Chen, Xu Chen, Ya Zhang, Yanfeng Wang, and Qi Tian \"\r\n53,Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation,https://github.com/LinZhuoChen/DUpsampling,,,,,,,Tuesday,Poster 1.2,65,Zhi Tian,\"Zhi Tian, Chunhua Shen, Tong He, Youliang Yan\"\r\n197,Label Efficient Semi-Supervised Learning via Graph Filtering,https://github.com/liqimai/Efficient-SSL,,,,,,,Thursday,Poster 3.1,95,Qimai Li,\"Qimai Li, Xiao-Ming Wu, Han Liu, Xiaotong Zhang\"\r\n87,Pose2Seg: Detection Free Human Instance Segmentation,https://github.com/liruilong940607/Pose2Seg,,http://www.liruilong.cn/Pose2Seg/index.html?,,,,,Tuesday,Poster 1.1,66,Song-Hai Zhang,\"Song-Hai Zhang, Ruilong Li, Xin Dong, Paul L. Rosin, Zixi Cai, Han Xi, Dingcheng Yang, Hao-Zhi Huang, Shi-Min Hu\"\r\n181,Heavy Rain Image Restoration: Integrating Physics Model and Conditional Adversarial Learning,https://github.com/liruoteng/HeavyRainRemoval,,,,,,,Tuesday,Poster 1.1,165,Ruoteng Li,\"Ruoteng Li, Loong Fah Cheong, Robby T. 
Tan\"\r\n211,Lending Orientation to Neural Networks for Cross-View Geo-Localization,https://github.com/Liumouliu/OriCNN,,,,,,,Wednesday,Poster 2.1,123,Liu Liu,\"Liu Liu, Hongdong Li\"\r\n218,Dual Residual Networks Leveraging the Potential of Paired Operations for Image Restoration,https://github.com/liu-vis/DualResidualNetworks,,,,,,,Wednesday,Poster 2.2,8,Xing Liu,\"Xing Liu, Masanori Suganuma, Zhun Sun, Takayuki Okatani\"\r\n34,High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection,https://github.com/liuwei16/CSP,,,,,,,Wednesday,Poster 2.1,64,Wei Liu,\"Zhao-Min Chen, Xiu-Shen Wei Peng Wang3Yanwen Guo\"\r\n106,Partial Order Pruning: For Best Speed/Accuracy Trade-Off in Neural Architecture Search,https://github.com/lixincn2015/Partial-Order-Pruning,,,,,,,Thursday,Poster 3.1,20,Xin Li,\"Xin Li, Yiming Zhou, Zheng Pan, Jiashi Feng \"\r\n128,Bidirectional Learning for Domain Adaptation of Semantic Segmentation,https://github.com/liyunsheng13/BDL,,https://voca.is.tue.mpg.de/,,,,,Wednesday,Poster 2.2,1,Yunsheng Li,\"Yunsheng Li, Lu Yuan, Nuno Vasconcelos\"\r\n57,Occupancy Networks: Learning 3D Reconstruction in Function Space,https://github.com/LMescheder/Occupancy-Networks,,,,Wednesday,Oral 2.1B,8:53,Wednesday,Poster 2.1,110,Lars Mescheder,Lars Mescheder and Michael Oechsle and Michael Niemeyer and Sebastian Nowozin and Andreas Geiger\r\n401,Learning Parallax Attention for Stereo Image Super-Resolution,https://github.com/LongguangWang/PASSRnet,,,,,,,Thursday,Poster 3.2,166,Longguang Wang,\"Longguang Wang, Yingqian Wang, Zhengfa Liang, Zaiping Lin, Jungang Yang, Wei An, Yulan Guo\"\r\n120,Spatiotemporal CNN for Video Object Segmentation,https://github.com/longyin880815/STCNN,,,,,,,Tuesday,Poster 1.1,139,Kai Xu,\"Kai Xu, Longyin Wen, Guorong Li, Liefeng Bo, Qingming Huang\"\r\n109,End-To-End Multi-Task Learning With Attention,https://github.com/lorenmt/mtan,,,,,,,Tuesday,Poster 1.1,189,Shikun Liu,\"Shikun Liu, Edward Johns, Andrew J. 
Davison\"\r\n285,MAP Inference via Block-Coordinate Frank-Wolfe Algorithm,https://github.com/LPMP/LPMP,,,,Thursday,Oral 3.2C,15:05,Thursday,Poster 3.2,163,Paul Swoboda,\"Paul Swoboda, Vladimir Kolmogorov\"\r\n402,Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition,https://github.com/lshiwjx/2s-AGCN,,,,,,,Thursday,Poster 3.2,125,Lei Shi,\r\n152,Single Image Deraining: A Comprehensive Benchmark Analysis,https://github.com/lsy17096535/Single-Image-Deraining,,,,,,,Tuesday,Poster 1.2,155,Siyuan Li,\"Siyuan Li, Iago Breno Araujo, Wenqi Ren, Zhangyang Wang, Eric K. Tokuda, Roberto Hirata Junior, Roberto Cesar-Junior, Jiawan Zhang, Xiaojie Guo, Xiaochun Cao\"\r\n188,SparseFool: A Few Pixels Make a Big Difference,https://github.com/LTS4/SparseFool,,,,,,,Thursday,Poster 3.1,14,Apostolos Modas,\"Apostolos Modas, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard\"\r\n155,Taking a Deeper Look at the Inverse Compositional Algorithm,https://github.com/lvzhaoyang/DeeperInverseCompositionalAlgorithm,,,,Wednesday,Oral 2.1C,8:35,Wednesday,Poster 2.1,136,Zhaoyang Lv,\"Zhaoyang Lv, Frank Dellaert, James M. Rehg, Andreas Geiger\"\r\n403,Dense Intrinsic Appearance Flow for Human Pose Transfer,https://github.com/ly015/intrinsic_flow?,,,,,,,Tuesday,Poster 1.2,140,Yining Li,\"Yining Li, Chen Huang, Chen Change Loy\"\r\n60,Pluralistic Image Completion,https://github.com/lyndonzheng/Pluralistic-Inpainting,,http://www.chuanxiaz.com/publication/pluralistic/,,,,,Tuesday,Poster 1.1,145,Chuanxia Zheng,\"Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai\"\r\n404,TopNet: Structural Point Cloud Decoder,https://github.com/lynetcha/completion3d,,,,,,,Tuesday,Poster 1.1,15,Lyne P. Tchapmi,\"Tchapmi, Lyne P and Kosaraju, Vineet and Rezatofighi, S. 
Hamid and Reid, Ian and Savarese, Silvio\"\r\n113,Attention Branch Network: Learning of Attention Mechanism for Visual Explanation,https://github.com/machine-perception-robotics-group/attention_branch_network,,,,Thursday,Oral 3.2A,14:16,Thursday,Poster 3.2,9,Hiroshi Fukui,\"Hiroshi Fukui, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi\"\r\n405,End-To-End Efficient Representation Learning via Cascading Combinatorial Optimization,https://github.com/maestrojeong/Deep-Hash-Table-CVPR19?,,,,,,,Thursday,Poster 3.2,41,Yeonwoo Jeong,\"Yeonwoo Jeong, Yoonsuing Kim, Hyun Oh Song\"\r\n406,Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning,https://github.com/MalongTech/research-ms-loss?,,,,,,,Wednesday,Poster 2.1,47,Xun Wang,\"Xun Wang, Xintong Han, Weilin Huang, Dengke Dong, Matthew R. Scott\"\r\n189,Unsupervised Embedding Learning via Invariant and Spreading Instance Feature,https://github.com/mangye16/Unsupervised_Embedding_Learning,,,,,,,Wednesday,Poster 2.1,198,Mang Ye,\"Mang Ye, Xu Zhang, Pong C. Yuen, Shih-Fu Chang\"\r\n407,Seamless Scene Segmentation,https://github.com/mapillary/seamseg,,,,,,,Wednesday,Poster 2.2,174,Lorenzo Porzi,\"Lorenzo Porzi, Samuel Rota Bulò, Aleksander Colovic, Peter Kontschieder\"\r\n198,AET vs. 
AED: Unsupervised Representation Learning by Auto-Encoding Transformations Rather Than Data,https://github.com/maple-research-lab/AET,,,,Tuesday,Oral 1.2C,13:58,Tuesday,Poster 1.2,171,Liheng Zhang,\"Liheng Zhang, Guo-Jun Qi, Liqiang Wang, Jiebo Luo\"\r\n408,Light Field Messaging With Deep Photographic Steganography,https://github.com/mathski/LFM,,,,,,,Tuesday,Poster 1.1,153,Eric Wengrowski,Eric Wengrowski and Kristin Dana\r\n280,Robustness of 3D Deep Learning in an Adversarial Setting,https://github.com/matthewwicker/IterativeSalienceOcclusion,,,,,,,Thursday,Poster 3.2,81,Matthew Wicker,\"Matthew Wicker, Marta Kwiatkowska\"\r\n409,Guided Stereo Matching,https://github.com/mattpoggi/guided-stereo,,,,,,,Tuesday,Poster 1.1,87,Matteo Poggi,\"Matteo Poggi, Davide Pallotti, Fabio Tosi and Stefano Mattoccia\"\r\n237,A Cross-Season Correspondence Dataset for Robust Semantic Segmentation,https://github.com/maunzzz/cross-season-segmentation,,,,,,,Thursday,Poster 3.1,75,Mans Larsson,\"Måns Larsson, Erik Stenborg, Lars Hammarstrand, Torsten Sattler, Mark Pollefeys, Fredrik Kahl \"\r\n410,Why ReLU Networks Yield High-Confidence Predictions Far Away From the Training Data and How to Mitigate the Problem,https://github.com/max-andr/relu_networks_overconfident,,,,Tuesday,Oral 1.1A,9:23,Tuesday,Poster 1.1,5,Matthias Hein,\"Matthias Hein, Maksym Andriushchenko, Julian Bitterwolf\"\r\n897,Rethinking the Evaluation of Video Summaries,https://github.com/mayu-ot/rethinking-evs,,,,,,,Wednesday,Poster 2.2,87,Mayu Otani,\r\n411,Deep Blind Video Decaptioning by Temporal Aggregation and Recurrence,https://github.com/mcahny/Deep_Blind_Video_Decaptioning?,,,,,,,Tuesday,Poster 1.2,216,Dahun Kim,\"Dahun Kim, Sanghyun Woo, Joon-Young Lee, In So Kweon\"\r\n412,Deep Video Inpainting,https://github.com/mcahny/Deep_Video_Inpainting,,,,,,,Wednesday,Poster 2.1,155,Dahun Kim,\"Dahun Kim, Sanghyun Woo, Joon-Young Lee, In So Kweon\"\r\n413,Learning to Extract Flawless Slow Motion From Blurry 
Videos,https://github.com/MeiguangJin/slow-motion,,,,,,,Wednesday,Poster 2.2,157,Meiguang Jin,\"Jin, Meiguang and Hu, Zhe and Favaro, Paolo\"\r\n157,SIXray: A Large-Scale Security Inspection X-Ray Benchmark for Prohibited Item Discovery in Overlapping Images,https://github.com/MeioJane/SIXray,,,,,,,Tuesday,Poster 1.1,214,Caijing Miao,\"Caijing Miao, Lingxi Xie, Fang Wan, Chi Su, Hongye Liu, Jianbin Jiao, Qixiang Ye\"\r\n414,Self-Supervised 3D Hand Pose Estimation Through Training by Fitting,https://github.com/melonwan/sphereHand,,,,Thursday,Oral 3.2B,13:58,Thursday,Poster 3.2,97,Chengde Wan,\r\n38,Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation,https://github.com/MenghaoGuo/AutoDeeplab,,,,Tuesday,Oral 1.1A,9:46,Tuesday,Poster 1.1,9,Chenxi Liu,\"Peng Zhang, Fuhao Zou, Zhiwen Wu, Nengli Dai, Skarpness Mark, Michael Fu, Juan Zhao, Kai Li\"\r\n273,SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception,https://github.com/mengyuest/SIGNet,,https://mengyuest.github.io/SIGNet/,,,,,Thursday,Poster 3.1,118,Yue Meng,\"Y. Meng, Y. Lu, A. Raj, S. Sunarjo, R. Guo, T. Javidi, G. Bansal, D. 
Bharadia\"\r\n249,Variational Prototyping-Encoder: One-Shot Learning With Prototypical Images,https://github.com/mibastro/VPE,,,,,,,Thursday,Poster 3.1,53,Junsik Kim,\"Junsik Kim, Tae-Hyun Oh, Seokju Lee, Fei Pan, In So Kweon\"\r\n78,D2-Net: A Trainable CNN for Joint Description and Detection of Local Features,https://github.com/mihaidusmanu/d2-net,,https://dsmn.ml/publications/d2-net.html,,,,,Wednesday,Poster 2.2,155,Mihai Dusmanu,\"Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Pollefeys, Josef Sivic, Akihiko Torii, Torsten Sattler\"\r\n49,Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving,https://github.com/mileyan/pseudo_lidar,,https://mileyan.github.io/pseudo_lidar/,,,,,Wednesday,Poster 2.2,209,Yan Wang,\"Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger\"\r\n415,Deep Modular Co-Attention Networks for Visual Question Answering,https://github.com/MILVLG/mcan-vqa,,,,,,,Wednesday,Poster 2.1,205,Zhou Yu,\"Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian\"\r\n20,HAQ: Hardware-Aware Automated Quantization With Mixed Precision,https://github.com/MIT-HAN-LAB/ProxylessNAS,,https://hanlab.mit.edu/projects/haq/,,Thursday,Oral 3.1A,9:24,Thursday,Poster 3.1,199,Kuan Wang,\"Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han\"\r\n41,Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry,https://github.com/mkocabas/EpipolarPose,,,,,,,Tuesday,Poster 1.1,97,Muhammed Kocabas,\"Muhammed Kocabas, Salih Karagoz, Emre Akbas\"\r\n89,PoseFix: Model-Agnostic General Human Pose Refinement Network,https://github.com/mks0601/PoseFix_RELEASE,,,,,,,Wednesday,Poster 2.2,105,Gyeongsik Moon,\"Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee \"\r\n286,Meta-Learning Convolutional Neural Architectures for Multi-Target Concrete Defect Classification With the COncrete DEfect BRidge IMage Dataset,https://github.com/MrtnMndt/meta-learning-CODEBRIM,,,,,,,Thursday,Poster 3.2,22,Martin 
Mundt,\"Martin Mundt, Sagnik Majumder, Sreenivas Murali, Panagiotis Panetsos, Visvanathan Ramesh\"\r\n5,\"Deformable ConvNets V2: More Deformable, Better Results\",https://github.com/msracver/Deformable-ConvNets,,,,,,,Thursday,Poster 3.1,37,Xizhou Zhu,\"Xizhou Zhu, Han Hu, Stephen Lin, Jifeng Dai\"\r\n212,Query-Guided End-To-End Person Search,https://github.com/munjalbharti/Query-guided-End-to-End-Person-Search,,,,,,,Tuesday,Poster 1.1,58,Bharti Munjal,\"Bharti Munjal, Sikandar Amin, Federico Tombari, Fabio Galasso\"\r\n203,Sliced Wasserstein Generative Models,https://github.com/musikisomorphie/swd,,,,,,,Tuesday,Poster 1.2,142,Jiqing Wu,\"Jiqing Wu, Zhiwu Huang, Dinesh Acharya, Wen Li, Janine Thoma, Danda Pani Paudel, Luc Van Gool\"\r\n416,Unsupervised Face Normalization With Extreme Pose and Expression in the Wild,https://github.com/mx54039q/fnm,,,,,,,Thursday,Poster 3.1,122,Yichen Qian,\"Qian, Yichen and Deng, Weihong and Hu, Jiani\"\r\n500,Self-Supervised Representation Learning From Videos for Facial Action Unit Detection,https://github.com/mysee1989/TCAE,,,,Thursday,Oral 3.2B,14:42,Thursday,Poster 3.2,104,Yong Li,\r\n921,RegularFace: Deep Face Recognition via Exclusive Regularization,https://github.com/naimple/regular-face-cvpr2019,,,,,,,Tuesday,Poster 1.1,103,Kai Zhao,\r\n502,SiCloPe: Silhouette-Based Clothed People,https://github.com/nanopoteto/SiCloPe_CVPR,,,,Wednesday,Oral 2.1B,9:06,Wednesday,Poster 2.1,112,Ryota Natsume,\r\n287,Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition,https://github.com/naraysa/gzsl-od,,,,,,,Thursday,Poster 3.1,136,Devraj Mandal,\"Devraj Mandal, Sanath Narayan, Saikumar Dwivedi, Vikram Gupta, Shuaib Ahmed, Fahad Shahbaz Khan, Ling Shao\"\r\n219,BASNet: Boundary-Aware Salient Object Detection,https://github.com/NathanUA/BASNet,,,,,,,Wednesday,Poster 2.2,75,Xuebin Qin,\"Zichen Zhang, Chenyang Huang, Chao Gao, Masood Dehghan and Martin Jagersand\"\r\n160,Camera Lens 
Super-Resolution,https://github.com/ngchc/CameraSR,,,,,,,Tuesday,Poster 1.1,167,Chang Chen,\"Chang Chen, Zhiwei Xiong, Xinmei Tian, Zheng-Jun Zha, Feng Wu\"\r\n74,Style Transfer by Relaxed Optimal Transport and Self-Similarity,https://github.com/nkolkin13/STROTSS,,,,,,,Thursday,Poster 3.1,143,Nicholas Kolkin,\"Nicholas Kolkin, Jason Salavon, Gregory Shakhnarovich \"\r\n417,Convolutional Mesh Regression for Single-Image Human Shape Reconstruction,https://github.com/nkolot/GraphCMR,,,,Wednesday,Oral 2.1B,9:16,Wednesday,Poster 2.1,114,Nikos Kolotouros,\"Nikos Kolotouros, Georgios Pavlakos, Kostas Daniilidis\"\r\n213,Timeception for Complex Action Recognition,https://github.com/noureldien/timeception,,,,Tuesday,Oral 1.1C,9:05,Tuesday,Poster 1.1,110,Noureldien Hussein,\"Noureldien Hussein, Efstratios Gavves, Arnold W.M. Smeulders\"\r\n914,OCGAN: One-Class Novelty Detection Using GANs With Constrained Latent Representations,https://github.com/nuclearboy95/Anomaly-Detection-OCGAN-tensorflow,https://mp.weixin.qq.com/s/FXKWP4OlUsprSDgU7ue8CA?,,,,,,Tuesday,Poster 1.2,42,Pramuditha Perera,\"Pramuditha Perera, Ramesh Nallapati, Bing Xiang\"\r\n46,PointPillars: Fast Encoders for Object Detection From Point Clouds,https://github.com/nutonomy/second.pytorch,,,,,,,Thursday,Poster 3.2,211,Alex H. Lang,\"Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, Oscar Beijbom\"\r\n418,Improving Semantic Segmentation via Video Propagation and Label Relaxation,https://github.com/NVIDIA/semantic-segmentation,,,,Thursday,Oral 3.1C,8:53,Thursday,Poster 3.1,59,Yi Zhu,\"Yi Zhu, Karan Sapra, Fitsum A. Reda, Kevin J. 
Shih, Shawn Newsam, Andrew Tao, Bryan Catanzaro\"\r\n419,Pixel-Adaptive Convolutional Neural Networks,https://github.com/NVlabs/pacnet,,,,,,,Thursday,Poster 3.2,19,Hang Su,\"Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, and Jan Kautz.\"\r\n40,PlaneRCNN: 3D Plane Detection and Reconstruction From a Single Image,https://github.com/NVlabs/planercnn,,https://research.nvidia.com/publication/2019-06_PlaneRCNN,,Wednesday,Oral 2.1B,8:48,Wednesday,Poster 2.1,109,Chen Liu,\"Chen Liu, Kihwan Kim, Jinwei Gu, Yasutaka Furukawa, Jan Kautz\"\r\n420,SCOPS: Self-Supervised Co-Part Segmentation,https://github.com/NVlabs/SCOPS,,https://varunjampani.github.io/scops/,,,,,Tuesday,Poster 1.1,64,Wei-Chih Hung,Wei-Chih Hung Varun Jampani Sifei Liu Pavlo Molchanov Ming-Hsuan Yang Jan Kautz\r\n3,Semantic Image Synthesis With Spatially-Adaptive Normalization,https://github.com/NVlabs/SPADE,,,,Tuesday,Oral 1.2B,13:40,Tuesday,Poster 1.2,120,Taesung Park,\"Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu\"\r\n2,A Style-Based Generator Architecture for Generative Adversarial Networks,https://github.com/NVlabs/stylegan,,,GAN,Wednesday,Oral 2.1A,9:47,Wednesday,Poster 2.1,14,Tero Karras,\"Tero Karras (NVIDIA), Samuli Laine (NVIDIA), Timo Aila (NVIDIA)\"\r\n421,Importance Estimation for Neural Network Pruning,https://github.com/NVlabs/Taylor_pruning,,,,,,,Thursday,Poster 3.2,29,Pavlo Molchanov,\"Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, Jan Kautz\"\r\n422,Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations,https://github.com/nv-tlabs/STEAL,,https://nv-tlabs.github.io/STEAL/?,,Thursday,Oral 3.2C,14:24,Thursday,Poster 3.2,156,David Acuna,\"David Acuna, Amlan Kar, Sanja Fidler\"\r\n423,Libra R-CNN: Towards Balanced Learning for Object Detection,https://github.com/OceanPang/Libra_R-CNN,,,,,,,Tuesday,Poster 1.1,59,Jiangmiao Pang,\r\n424,3D Appearance Super-Resolution With Deep 
Learning,https://github.com/ofsoundof/3D_Appearance_SR,,,,,,,Thursday,Poster 3.1,104,Yawei Li,\"Yawei Li, Vagia Tsiminaki, Radu Timofte, Marc Pollefeys, Luc van Gool\"\r\n58,DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation,https://github.com/Oktosha/DeepSDF-explained?,,,,Tuesday,Oral 1.1B,9:23,Tuesday,Poster 1.1,76,Jeong Joon Park,\"Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, Steven Lovegrove\"\r\n425,Hybrid Task Cascade for Instance Segmentation,https://github.com/open-mmlab/mmdetection,https://mp.weixin.qq.com/s/xug0xKfc9RgJEUci1a_xog,,,,,,Wednesday,Poster 2.1,42,Kai Chen,\"Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin\"\r\n426,Region Proposal by Guided Anchoring,https://github.com/open-mmlab/mmdetection,https://mp.weixin.qq.com/s/Sl958JkcJjy-HW9_c-SH4g,,,,,,Tuesday,Poster 1.2,49,Jiaqi Wang,\"Jiaqi Wang, Kai Chen, Shuo Yang, Chen Change Loy, Dahua Lin\"\r\n170,Learning to Sample,https://github.com/orendv/learning_to_sample,,,,,,,Tuesday,Poster 1.2,10,Oren Dovrat,\"Oren Dovrat, Itai Lang, Shai Avidan\"\r\n91,In Defense of Pre-Trained ImageNet Architectures for Real-Time Semantic Segmentation of Road-Driving Images,https://github.com/orsic/swiftnet,,,,,,,Thursday,Poster 3.2,202,Marin Orsic,\"Marin Oršić, Ivan Krešo, Petra Bevandić, Siniša Šegvić\"\r\n517,Unsupervised Learning of Dense Shape Correspondence,https://github.com/OshriHalimi/unsupervised_learning_of_dense_shape_correspondence,,,,Wednesday,Oral 2.1A,9:29,Wednesday,Poster 2.1,11,Oshri Halimi,\r\n427,Translate-to-Recognize Networks for RGB-D Scene Recognition,https://github.com/ownstyledu/Translate-to-Recognize-Networks,,https://ownstyledu.github.io/Translate-to-Recognize-Networks/,,,,,Thursday,Poster 3.2,88,Dapeng Du,\"Du, Dapeng and Wang, Limin and Wang, Huiling and Zhao, Kai and Wu, Gangshan\"\r\n265,Bringing a Blurry Frame Alive at High 
Frame-Rate With an Event Camera,https://github.com/panpanfei/Bringing-a-Blurry-Frame-Alive-at-High-Frame-Rate-with-an-Event-Camera,,,,Wednesday,Oral 2.2C,14:06,Wednesday,Poster 2.2,136,Liyuan Pan,\"Liyuan Pan, Cedric Scheerlinck, Xin Yu, Richard Hartley, Miaomiao Liu, Yuchao Dai\"\r\n64,Feedback Network for Image Super-Resolution,https://github.com/Paper99/SRFBN_CVPR19,,,,,,,Tuesday,Poster 1.2,158,Zhen Li,\"Zhen Li, Jinglei Yang, Zheng Liu, Xiaomin Yang, Gwanggil Jeon, Wei Wu\"\r\n167,Superquadrics Revisited: Learning 3D Shape Parsing Beyond Cuboids,https://github.com/paschalidoud/superquadric_parsing,,,,,,,Thursday,Poster 3.1,173,Despoina Paschalidou,\"Despoina Paschalidou, Ali Osman Ulusoy, Andreas Geiger\"\r\n428,SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks,https://github.com/PengBoXiangShang/SiamRPN_plus_plus_PyTorch,https://mp.weixin.qq.com/s/dB5u2No8eakLnrjto0kvyQ,http://bo-li.info/SiamRPN++/,,Wednesday,Oral 2.1A,8:35,Wednesday,Poster 2.1,2,Bo Li,\"Bo Li, Wei Wu, Qiang Wang, Fangyi Zhang, Junliang Xing, Junjie Yan\"\r\n429,Representation Flow for Action Recognition,https://github.com/piergiaj/representation-flow-cvpr19,,https://piergiaj.github.io/rep-flow-site/,,,,,Thursday,Poster 3.1,132,AJ Piergiovanni,\"AJ Piergiovanni, Michael S. 
Ryoo\"\r\n266,Content Authentication for Neural Imaging Pipelines: End-To-End Optimization of Photo Provenance in Complex Distribution Channels,https://github.com/pkorus/neural-imaging,,,,Thursday,Oral 3.1A,9:29,Thursday,Poster 3.1,200,Pawel Korus,\"Pawel Korus, Nasir Memon\"\r\n125,Bi-Directional Cascade Network for Perceptual Edge Detection,https://github.com/pkuCactus/BDCN,,,,,,,Tuesday,Poster 1.2,154,Jianzhong He,\"Jianzhong He, Shiliang Zhang, Ming Yang, Yanhu Shan, Tiejun Huang\"\r\n430,SelFlow: Self-Supervised Learning of Optical Flow,https://github.com/ppliuboy/SelFlow,,,,Wednesday,Oral 2.1C,8:30,Wednesday,Poster 2.1,135,Pengpeng Liu,\r\n190,JSIS3D: Joint Semantic-Instance Segmentation of 3D Point Clouds With Multi-Task Pointwise Networks and Multi-Value Conditional Random Fields,https://github.com/pqhieu/JSIS3D,,https://pqhieu.github.io/cvpr19.html?,,Thursday,Oral 3.1C,8:35,Thursday,Poster 3.1,56,Quang-Hieu Pham,\"Quang-Hieu Pham, Duc Thanh Nguyen, Binh-Son Hua, Gemma Roig, Sai-Kit Yeung \"\r\n431,Learning Single-Image Depth From Videos Using Quality Assessment Networks,https://github.com/princeton-vl/YouTube3D,,,,,,,Wednesday,Poster 2.1,121,Weifeng Chen,\r\n432,Describing Like Humans: On Diversity in Image Captioning,https://github.com/qingzwang/DiversityMetrics,,,,,,,Tuesday,Poster 1.2,209,Qingzhong Wang,\"Qingzhong Wang, Antoni B. 
Chan\"\r\n1090,Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification,https://github.com/QizeYang/PAUL,,,,,,,Tuesday,Poster 1.2,116,Qize Yang,\r\n433,Amodal Instance Segmentation With KINS Dataset,https://github.com/qqlu/Amodal-Instance-Segmentation-through-KINS-Dataset,,,,,,,Tuesday,Poster 1.2,54,Lu Qi,\" Lu Qi, Li Jiang, Shu Liu, Xiaoyong Shen, Jiaya Jia\"\r\n434,Information Maximizing Visual Question Generation,https://github.com/ranjaykrishna/iq,,,,,,,Tuesday,Poster 1.1,203,Ranjay Krishna,\"Krishna, Ranjay and Bernstein, Michael and Fei-Fei, Li\"\r\n182,Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting,https://github.com/researchmm/PEN-Net-for-Inpainting,,,,,,,Tuesday,Poster 1.1,150,Yanhong Zeng,\"Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo\"\r\n39,Deeper and Wider Siamese Networks for Real-Time Visual Tracking,https://github.com/researchmm/SiamDW,,,,Wednesday,Oral 2.1C,8:40,Wednesday,Poster 2.1,137,Zhipeng Zhang,\"Zhipeng Zhang, Houwen Peng\"\r\n71,Class-Balanced Loss Based on Effective Number of Samples,https://github.com/richardaecn/class-balanced-loss,,,,,,,Thursday,Poster 3.1,33,Yin Cui,\"Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge Belongie\"\r\n299,Understanding the Disharmony Between Dropout and Batch Normalization by Variance Shift,https://github.com/rileykwok/Food-Classification,,,,,,,Tuesday,Poster 1.2,2,Xiang Li,\"Xiang Li, Shuo Chen, Xiaolin Hu, Jian Yang\"\r\n214,Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos,https://github.com/RomeroBarata/skeleton_based_anomaly_detection,,,,,,,Thursday,Poster 3.2,122,Romero Morais,\"Romero Morais, Vuong Le, Truyen Tran, Budhaditya Saha, Moussa Mansour, Svetha Venkatesh \"\r\n37,From Recognition to Cognition: Visual Commonsense Reasoning,https://github.com/rowanz/r2c,,,,Wednesday,Oral 2.2B,14:52,Wednesday,Poster 2.2,191,Rowan Zellers,\"Rowan Zellers, Yonatan Bisk, Ali Farhadi, Yejin Choi\"\r\n174,Taking a 
Closer Look at Domain Shift: Category-Level Adversaries for Semantics Consistent Domain Adaptation,https://github.com/RoyalVane/CLAN,,,,Tuesday,Oral 1.2C,13:35,Tuesday,Poster 1.2,167,Yawei Luo,\"Yawei Luo, Liang Zheng, Tao Guan, Junqing Yu, Yi Yang\"\r\n435,Unsupervised Domain Adaptation Using Feature-Whitening and Consensus Loss,https://github.com/roysubhankar/dwt-domain-adaptation,,,,,,,Thursday,Poster 3.1,54,Subhankar Roy,\r\n224,Regularizing Activation Distribution for Training Binarized Deep Networks,https://github.com/ruizhoud/DistributionLoss,,,,,,,Thursday,Poster 3.2,44,Ruizhou Ding,\"Ruizhou Ding, Ting-Wu Chin, Zeye Liu, Diana Marculescu\"\r\n436,S4Net: Single Stage Salient-Instance Segmentation,https://github.com/RuochenFan/S4Net,,,,,,,Wednesday,Poster 2.1,187,Ruochen Fan,\"Ruochen Fan, Ming-Ming Cheng, Qibin Hou, Tai-Jiang Mu, Jingdong Wang, Shi-Min Hu\"\r\n300,Detecting Overfitting of Deep Generative Networks via Latent Recovery,https://github.com/ryanwebster90/gen-overfitting-latent-recovery,,,,,,,Thursday,Poster 3.2,30,Ryan Webster,\"Ryan Webster, Julien Rabin, Loic Simon, Frederic Jurie\"\r\n26,\"ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network\",https://github.com/sacmehta/ESPNetv2?,,,,,,,Thursday,Poster 3.1,25,Sachin Mehta,\"Sachin Mehta, Mohammad Rastegari, Linda Shapiro, Hannaneh Hajishirzi\"\r\n274,ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging,https://github.com/samarth-robo/contactdb_prediction,,,,Thursday,Oral 3.1B,8:53,Thursday,Poster 3.1,81,Samarth Brahmbhatt,\"Samarth Brahmbhatt, Cusuh Ham, Charles C. 
Kemp, and James Hays\"\r\n437,Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video,https://github.com/SamvitJ/Accel,,,,Thursday,Oral 3.1C,8:58,Thursday,Poster 3.1,60,Samvit Jain,\r\n255,Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning,https://github.com/SAP/machine-learning-dgm,,,,,,,Thursday,Poster 3.2,35,Oleksiy Ostapenko,\"Oleksiy Ostapenko, Tassilo Klein, Mihai Puscas, Patrick Jähnichen, Moin Nabi\"\r\n438,SOSNet: Second Order Similarity Regularization for Local Descriptor Learning,https://github.com/scape-research/SOSNet,,,,Thursday,Oral 3.2C,13:48,Thursday,Poster 3.2,150,Yurun Tian,\r\n176,DAVANet: Stereo Deblurring With View Aggregation,https://github.com/sczhou/DAVANet,,,,Thursday,Oral 3.2C,13:35,Thursday,Poster 3.2,148,Shangchen Zhou,\"Shangchen Zhou, Jiawei Zhang, Wangmeng Zuo, Haozhe Xie, Jinshan Pan, Jimmy Ren\"\r\n161,WarpGAN: Automatic Caricature Generation,https://github.com/seasonSH/WarpGAN,,,,Thursday,Oral 3.2A,14:52,Thursday,Poster 3.2,15,Yichun Shi,\"Yichun Shi, Debayan Deb, Anil K. Jain\"\r\n88,3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans,https://github.com/Sekunde/3D-SIS,,,,Wednesday,Oral 2.1B,8:30,Wednesday,Poster 2.1,106,Ji Hou,\"Ji Hou, Angela Dai, Matthias Nießner\"\r\n301,Understanding and Visualizing Deep Visual Saliency Models,https://github.com/SenHe/uavdvsm,,,,,,,Thursday,Poster 3.1,159,Sen He,\"Sen He, Hamed R. 
Tavakoli, Ali Borji, Yang Mi, Nicolas Pugeault\"\r\n302,Unsupervised Visual Domain Adaptation: A Deep Max-Margin Gaussian Process Approach,https://github.com/seqam-lab/GPDA,,,,Wednesday,Oral 2.1A,9:34,Wednesday,Poster 2.1,12,Minyoung Kim,\"Minyoung Kim, Pritish Sahu, Behnam Gholami, Vladimir Pavlovic\"\r\n225,Attention-Based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions,https://github.com/sg-nm/Operation-wise-attention-network,,,,,,,Thursday,Poster 3.1,9,Masanori Suganuma,\"Masanori Suganuma, Xing Liu, Takayuki Okatani\"\r\n439,FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation From a Single Image,https://github.com/shamangary/FSA-Net,,,,,,,Tuesday,Poster 1.1,98,Tsun-Yi Yang,\"Tsun-Yi Yang, Yi-Ting Chen, Yen-Yu Lin, and Yung-Yu Chuang\"\r\n256,Centripetal SGD for Pruning Very Deep Convolutional Networks With Complicated Structure,https://github.com/ShawnDing1994/Centripetal-SGD,,,,,,,Wednesday,Poster 2.1,39,Xiaohan Ding,\"Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han\"\r\n440,Wide-Context Semantic Image Extrapolation,https://github.com/shepnerd/outpainting_srn,,,,,,,Tuesday,Poster 1.1,141,Yi Wang,\"Yi Wang, Xin Tao, Xiaoyong Shen, Jiaya Jia.\"\r\n441,Scene Parsing via Integrated Classification Model and Variance-Based Regularization,https://github.com/shihengcan/ICM-matcaffe,,,,,,,Wednesday,Poster 2.1,76,Hengcan Shi,\"Hengcan Shi, Hongliang Li, Qingbo Wu, Zichen Song\"\r\n139,Explainable and Explicit Visual Reasoning Over Scene Graphs,https://github.com/shijx12/XNM-Net,,,,,,,Wednesday,Poster 2.2,202,Jiaxin Shi,\"Jiaxin Shi, Hanwang Zhang, Juanzi Li\"\r\n220,MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation,https://github.com/shuangjiexu/MHP-VOS,,,,Tuesday,Oral 1.1C,9:41,Tuesday,Poster 1.1,116,Shuangjie Xu,\"Shuangjie Xu, Daizong Liu, Linchao Bao, Wei Liu, Pan Zhou\"\r\n153,RePr: Improved Training of Convolutional 
Filters,https://github.com/siahuat0727/RePr,,,,Thursday,Oral 3.2A,13:53,Thursday,Poster 3.2,5,Aaditya Prakash,\"Aaditya Prakash, James Storer, Dinei Florencio, Cha Zhang\"\r\n510,Unsupervised Person Image Generation With Semantic Parsing Transformation,https://github.com/SijieSong/person_generation_spt,,,,Tuesday,Oral 1.2B,13:53,Tuesday,Poster 1.2,122,Sijie Song,\r\n191,Leveraging Shape Completion for 3D Siamese Tracking,https://github.com/SilvioGiancola/ShapeCompletion3DTracking,,,,,,,Tuesday,Poster 1.1,137,Silvio Giancola,\"Silvio Giancola, Jesus Zarzar, Bernard Ghanem\"\r\n442,Learning Non-Volumetric Depth Fusion Using Successive Reprojections,https://github.com/simon-donne/defusr,,,,,,,Wednesday,Poster 2.2,91,Simon Donne,\"Simon Donné, Andreas Geiger\"\r\n241,Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds Using Convolutional Neural Networks,https://github.com/sitzikbs/Nesti-Net,,,,,,,Thursday,Poster 3.1,149,Yizhak Ben-Shabat,\"Yizhak Ben-Shabat, Michael Lindenbaum, Anath Fischer\"\r\n825,On Stabilizing Generative Adversarial Training With Noise,https://github.com/sjenni/dfgan,,,,,,,Thursday,Poster 3.2,137,Simon Jenni,\r\n443,Learning Context Graph for Person Search,https://github.com/sjtuzq/person_search_gcn,,,,Tuesday,Oral 1.2A,13:40,Tuesday,Poster 1.2,20,Yichao Yan,\"Yichao Yan, Qiang Zhang, Bingbing Ni, Wendong Zhang, Minghao Xu, Xiaokang Yang\"\r\n104,Scan2CAD: Learning CAD Model Alignment in RGB-D Scans,https://github.com/skanti/Scan2CAD,,,,Tuesday,Oral 1.2C,14:42,Tuesday,Poster 1.2,178,Armen Avetisyan,\"Armen Avetisyan, Manuel Dahnert, Angela Dai, Manolis Savva, Angel X. 
Chang, Matthias Nießner \"\r\n444,Content-Aware Multi-Level Guidance for Interactive Instance Segmentation,https://github.com/sm176357/mlg,,,,,,,Thursday,Poster 3.2,64,Soumajit Majumder,\r\n226,Attention Based Glaucoma Detection: A Large-Scale Database and CNN Model,https://github.com/smilell/AG-CNN,,,,,,,Thursday,Poster 3.1,211,Liu Li,\"Liu Li, Mai Xu, Xiaofei Wang, Lai Jiang, Hanruo Liu\"\r\n100,Parsing R-CNN for Instance-Level Human Analysis,https://github.com/soeaver/Parsing-R-CNN,,,,,,,Tuesday,Poster 1.1,13,Lu Yang,\"Lu Yang, Qing Song, Zhihui Wang, Ming Jiang\"\r\n275,Dynamics Are Important for the Recognition of Equine Pain in Video,https://github.com/sofiabroome/painface-recognition,,,,,,,Thursday,Poster 3.2,208,Sofia Broome,\"Sofia Broomé, Karina Bech Gleerup, Pia Haubro Andersen, Hedvig Kjellström\"\r\n35,A Dataset and Benchmark for Large-Scale Multi-Modal Face Anti-Spoofing,https://github.com/SoftwareGift/FeatherNets_Face-Anti-spoofing-Attack-Detection-Challenge-CVPR2019,,,,,,,Tuesday,Poster 1.1,69,Shifeng Zhang,\"Sachin Mehta, Mohammad Rastegari, Linda Shapiro, Hannaneh Hajishirzi\"\r\n105,Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision,https://github.com/soubhiksanyal/RingNet,,https://ringnet.is.tue.mpg.de/,,,,,Wednesday,Poster 2.2,104,Soubhik Sanyal,\"Soubhik Sanyal, Timo Bolkart, Haiwen Feng, Michael J. Black\"\r\n114,Efficient Parameter-Free Clustering Using First Neighbor Relations,https://github.com/ssarfraz/FINCH-CLustering,,,,Thursday,Oral 3.1C,9:42,Thursday,Poster 3.1,67,Saquib Sarfraz,\"M. 
Saquib Sarfraz, Vivek Sharma, Rainer Stiefelhagen\"\r\n148,Geometry-Aware Symmetric Domain Adaptation for Monocular Depth Estimation,https://github.com/sshan-zhao/GASDA,,,,,,,Thursday,Poster 3.1,116,Shanshan Zhao,\"Shanshan Zhao, Huan Fu, Mingming Gong, Dacheng Tao\"\r\n33,PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud,https://github.com/sshaoshuai/PointRCNN,,,,,,,Tuesday,Poster 1.1,54,Shaoshuai Shi,\"Shaoshuai Shi, Xiaogang Wang, Hongsheng Li\"\r\n44,4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks,https://github.com/StanfordVL/MinkowskiEngine,,,,,,,Tuesday,Poster 1.2,60,Christopher Choy,\"Christopher Choy, JunYoung Gwak, Silvio Savarese\"\r\n288,Generative Dual Adversarial Network for Generalized Zero-Shot Learning,https://github.com/stevehuanghe/GDAN,,,,,,,Tuesday,Poster 1.1,57,He Huang,\"He Huang, Changhu Wang, Philip S. Yu, Chang-Dong Wang\"\r\n140,Spatial Attentive Single-Image Deraining With a High Quality Real Rain Dataset,https://github.com/stevewongv/SPANet,,https://stevewongv.github.io/derain-project.html,,,,,Thursday,Poster 3.2,168,Tianyu Wang,\"Tianyu Wang*, Xin Yang*, Ke Xu, Shaozhe Chen, Qiang Zhang, Rynson Lau\"\r\n141,Aggregation Cross-Entropy for Sequence Recognition,https://github.com/summerlvsong/Aggregation-Cross-Entropy,,,,Wednesday,Oral 2.2A,14:52,Wednesday,Poster 2.2,39,Zecheng Xie,\"Zecheng Xie, Yaoxiong Huang, Yuanzhi Zhu, Lianwen Jin, Yuliang Liu, Lele Xie\"\r\n1211,Deeply-Supervised Knowledge Synergy,https://github.com/sundw2014/DKS,,,,,,,Wednesday,Poster 2.2,7,Dawei Sun,\r\n445,Leveraging Crowdsourced GPS Data for Road Extraction From Aerial Imagery,https://github.com/suniique/Leveraging-Crowdsourced-GPS-Data-for-Road-Extraction-from-Aerial-Imagery?,,,,,,,Wednesday,Poster 2.2,78,Tao Sun,\"Tao Sun, Zonglin Di, Pengyu Che, Chun Liu, Yin Wang\"\r\n138,HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch Data 
Augmentation,https://github.com/sunset1995/HorizonNet,,https://sunset1995.github.io/HorizonNet/,,,,,Tuesday,Poster 1.1,94,Cheng Sun,\"Cheng Sun, Chi-Wei Hsiao, Min Sun, Hwann-Tzong Chen \"\r\n94,Learning Linear Transformations for Fast Image and Video Style Transfer,https://github.com/sunshineatnoon/LinearStyleTransfer,,https://sites.google.com/view/linear-style-transfer-cvpr19/,,,,,Tuesday,Poster 1.2,152,Xueting Li,\"Xueting Li, Sifei Liu, Jan Kautz, Ming-Hsuan Yang\"\r\n73,Single-Image Piece-Wise Planar 3D Reconstruction via Associative Embedding,https://github.com/svip-lab/PlanarReconstruction,,,,,,,Tuesday,Poster 1.1,92,Zehao Yu,\"Zehao Yu, Jia Zheng, Dongze Lian, Zihan Zhou, Shenghua Gao\"\r\n227,PPGNet: Learning Point-Pair Graph for Line Segment Detection,https://github.com/svip-lab/PPGNet,,,,,,,Wednesday,Poster 2.2,18,Ziheng Zhang,\"Ziheng Zhang, Zhengxin Li, Ning Bi, Jia Zheng, Jinlei Wang, Kun Huang, Weixin Luo, Yanyu Xu, Shenghua Gao\"\r\n17,\"DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images\",https://github.com/switchablenorms/DeepFashion2,,,,,,,Wednesday,Poster 2.1,79,Yuying Ge,\"Yuying Ge, Ruimao Zhang, Lingyun Wu, Xiaogang Wang, Xiaoou Tang, Ping Luo\"\r\n168,SSN: Learning Sparse Switchable Normalization via SparsestMax,https://github.com/switchablenorms/Sparse_SwitchNorm,,,,,,,Tuesday,Poster 1.1,21,Wenqi Shao,\"Wenqi Shao, Tianjian Meng, Jingyu Li, Ruimao Zhang, Yudian Li, Xiaogang Wang, Ping Luo \"\r\n446,Dissecting Person Re-Identification From the Viewpoint of Viewpoint,https://github.com/sxzrt/Dissecting-Person-Re-ID-from-the-Viewpoint-of-Viewpoint,,,,,,,Tuesday,Poster 1.1,38,Xiaoxiao Sun,\"Ting Zhao, Xiangqian Wu\"\r\n447,Image-To-Image Translation via Group-Wise Deep Whitening-And-Coloring Transformation,https://github.com/taki0112/GDWCT-Tensorflow,,,,Thursday,Oral 3.2A,13:35,Thursday,Poster 3.2,2,Wonwoong Cho,\"Wonwoong Cho, Sungha Choi, David Keetae Park, Inkyu Shin, 
Jaegul Choo\"\r\n448,Sphere Generative Adversarial Network Based on Geometric Moment Matching,https://github.com/taki0112/SphereGAN-Tensorflow?,,,,Wednesday,Oral 2.1A,8:40,Wednesday,Poster 2.1,3,Sung Woo Park,Sung Woo Park and Junseok Kwon\r\n578,Label-Noise Robust Generative Adversarial Networks,https://github.com/takuhirok/rGAN,,,,Tuesday,Oral 1.2B,15:00,Tuesday,Poster 1.2,133,Takuhiro Kaneko,\r\n617,A Skeleton-Bridged Deep Learning Approach for Generating Meshes of Complex Topologies From Single RGB Images,https://github.com/tangjiapeng/SkeletonBridgeRecon,,,,Wednesday,Oral 2.1B,9:42,Wednesday,Poster 2.1,118,Jiapeng Tang,\r\n289,Defending Against Adversarial Attacks by Randomized Diversification,https://github.com/taranO/defending-adversarial-attacks-by-RD,,,,,,,Thursday,Poster 3.2,25,Olga Taran,\"O. Taran, S. Rezaeifar, T. Holotyak, and S. Voloshynovskiy\"\r\n257,Model-Blind Video Denoising via Frame-To-Frame Training,https://github.com/tehret/blind-denoising,,,,,,,Thursday,Poster 3.2,40,Thibaud Ehret,\"Thibaud Ehret, Axel Davy, Jean-Michel Morel, Gabriele Facciolo, Pablo Arias\"\r\n8,DSFD: Dual Shot Face Detector,https://github.com/TencentYoutuResearch/FaceDetection-DSFD,https://mp.weixin.qq.com/s/0rTCeHumVSv07hMCaCd7EA,,,,,,Wednesday,Poster 2.1,51,Jian Li,\"Jian Li, Yabiao Wang, Changan Wang, Ying Tai\"\r\n47,Progressive Pose Attention Transfer for Person Image Generation,https://github.com/tengteng95/Pose-Transfer,,,,Tuesday,Oral 1.2B,13:48,Tuesday,Poster 1.2,121,Zhen Zhu,\"Zhen Zhu, Tengteng Huang, Baoguang Shi, Miao Yu, Bofei Wang, Xiang Bai\"\r\n1,Detect-To-Retrieve: Efficient Regional Aggregation for Image Search,https://github.com/tensorflow/models/tree/master/research/delf,,,Retrieve,,,,Wednesday,Poster 2.1,56,Marvin Teichmann,\"Marvin Teichmann, Andre Araujo, Menglong Zhu, Jack Sim\"\r\n1247,FEELVOS: Fast End-To-End Embedding Learning for Video Object Segmentation,https://github.com/tensorflow/models/tree/master/research/feelvos,,,,,,,Thursday,Poster 
3.1,70,Paul Voigtlaender,\r\n27,MnasNet: Platform-Aware Neural Architecture Search for Mobile,https://github.com/tensorflow/tpu/tree/master/models/official/mnasnet,,,,,,,Tuesday,Poster 1.2,16,Mingxing Tan,\"Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, Quoc V. Le\"\r\n122,Learning to Reconstruct People in Clothing From a Single RGB Camera,https://github.com/thmoa/octopus,,,,,,,Tuesday,Poster 1.1,107,Thiemo Alldieck,\"Thiemo Alldieck, Marcus Magnor, Bharat Lal Bhatnagar, Christian Theobalt, Gerard Pons-Moll \"\r\n449,Learning to Transfer Examples for Partial Domain Adaptation,https://github.com/thuml/ETN,,,,,,,Tuesday,Poster 1.2,51,Zhangjie Cao,\"Zhangjie Cao, Kaichao You, Mingsheng Long, Jianmin Wang, Qiang Yang\"\r\n450,Universal Domain Adaptation,https://github.com/thuml/Universal-Domain-Adaptation?,,,,,,,Tuesday,Poster 1.2,6,Kaichao You,\"Kaichao You, Mingsheng Long, Zhangjie Cao, Jianmin Wang, and Michael I. Jordan\"\r\n451,Multispectral Imaging for Fine-Grained Recognition of Powders on Complex Backgrounds,https://github.com/tiancheng-zhi/ms-powder,,http://www.cs.cmu.edu/~ILIM/projects/IM/MSPowder/?,,Thursday,Oral 3.1B,8:48,Thursday,Poster 3.1,80,Tiancheng Zhi,\"Tiancheng Zhi, Bernardo R. Pires, Martial Hebert and Srinivasa G. Narasimhan\"\r\n290,On the Continuity of Rotation Representations in Neural Networks,https://github.com/tik0/6d,,,,,,,Wednesday,Poster 2.1,150,Yi Zhou,\"Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, Hao Li\"\r\n126,\"Capture, Learning, and Synthesis of 3D Speaking Styles\",https://github.com/TimoBolkart/voca,,,,,,,Thursday,Poster 3.1,148,Daniel Cudeiro,\"Daniel Cudeiro, Timo Bolkart, Cassidy Laidlaw, Anurag Ranjan, Michael J. 
Black\"\r\n585,Deep Metric Learning Beyond Binary Supervision,https://github.com/tjddus9597/Beyond-Binary-Supervision-CVPR19,,,,Tuesday,Oral 1.2A,15:00,Tuesday,Poster 1.2,33,Sungyeon Kim,\r\n571,Probabilistic Permutation Synchronization Using the Riemannian Structure of the Birkhoff Polytope,https://github.com/tolgabirdal/birkhoffrlmc,,,,Thursday,Oral 3.2C,14:42,Thursday,Poster 3.2,159,Tolga Birdal,Tolga Birdal & Umut Şimşekli\r\n281,On Zero-Shot Recognition of Generic Objects,https://github.com/TristHas/GOZ,,,,,,,Thursday,Poster 3.1,92,Tristan Hascoet,\"Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi\"\r\n276,Distilling Object Detectors With Fine-Grained Feature Imitation,https://github.com/twangnh/Distilling-Object-Detectors,,,,,,,Wednesday,Poster 2.1,38,Tao Wang,\"Tao Wang, Li Yuan, Xiaopeng Zhang, Jiashi Feng\"\r\n215,Elastic Boundary Projection for 3D Medical Image Segmentation,https://github.com/twni2016/Elastic-Boundary-Projection,,,,,,,Tuesday,Poster 1.1,213,Tianwei Ni,\"Tianwei Ni, Lingxi Xie, Huangjie Zheng, Elliot K. Fishman, Alan L. Yuille\"\r\n228,Reversible GANs for Memory-Efficient Image-To-Image Translation,https://github.com/tychovdo/RevGAN,,,,,,,Wednesday,Poster 2.1,16,Tycho F.A. van der Ouderaa,\"Tycho F.A. van der Ouderaa, Daniel E. 
Worrall\"\r\n452,Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks,https://github.com/tyohei/chainerkfac,,,,,,,Thursday,Poster 3.2,177,Kazuki Osawa,\"Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, and Satoshi Matsuoka\"\r\n48,UPSNet: A Unified Panoptic Segmentation Network,https://github.com/uber-research/UPSNet,,,,Thursday,Oral 3.1C,8:30,Thursday,Poster 3.1,55,Yuwen Xiong,\"Yuwen Xiong, Renjie Liao, Hengshuang Zhao, Rui Hu, Min Bai, Ersin Yumer, Raquel Urtasun\"\r\n131,Hierarchical Discrete Distribution Decomposition for Match Density Estimation,https://github.com/ucbdrive/hd3,,,,,,,Wednesday,Poster 2.1,181,Zhichao Yin,\"Zhichao Yin, Trevor Darrell, Fisher Yu\"\r\n453,TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning,https://github.com/ucbdrive/tafe-net,,,,,,,Tuesday,Poster 1.1,185,Xin Wang,\"Xin Wang, Fisher Yu, Ruth Wang, Trevor Darrell, Joseph E. Gonzalez\"\r\n454,Text2Scene: Generating Compositional Scenes From Textual Descriptions,https://github.com/uvavision/Text2Image?,,,,Wednesday,Oral 2.2B,14:47,Wednesday,Poster 2.2,190,Fuwen Tan,\"Yitong Li, Zhe Gan, Yelong Shen, Jingjing Liu, Yu Cheng, Yuexin Wu, Lawrence Carin, David Carlson, Jianfeng Gao\"\r\n209,ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation,https://github.com/valeoai/ADVENT,,,,Tuesday,Oral 1.2C,13:40,Tuesday,Poster 1.2,168,Tuan-Hung Vu,\"Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, Patrick Pérez\"\r\n154,Single Image Reflection Removal Exploiting Misaligned Training Data and Network Enhancements,https://github.com/Vandermode/ERRNet,,,,,,,Wednesday,Poster 2.2,164,Kaixuan Wei,\"Kaixuan Wei, Jiaolong Yang, Ying Fu, David Wipf, Hua Huang \"\r\n455,\"Expressive Body Capture: 3D Hands, Face, and Body From a Single Image\",https://github.com/vchoutas/smplify-x,,https://smpl-x.is.tue.mpg.de/?,,Thursday,Oral 3.2B,15:10,Thursday,Poster 
3.2,109,Georgios Pavlakos,\"Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, Michael J. Black\"\r\n456,C3AE: Exploring the Limits of Compact Model for Age Estimation,https://github.com/vicwer/C3AE_Age_Estimation,,,,,,,Thursday,Poster 3.2,200,Chao Zhang,\"Chao Zhang, Shuaicheng Liu, Xun Xu, Ce Zhu\"\r\n258,Weakly Supervised Deep Image Hashing Through Tag Embeddings,https://github.com/Vijetha1/WDHT,,,,,,,Thursday,Poster 3.1,176,Vijetha Gattupalli,\"Vijetha Gattupalli, Yaoxin Zhuo, Baoxin Li\"\r\n291,Actor-Critic Instance Segmentation,https://github.com/visinf/acis,,,,,,,Wednesday,Poster 2.2,170,Nikita Araslanov,\"Kwang In Kim, Hyung Jin Chang\"\r\n457,Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation,https://github.com/visinf/irr,,,,,,,Wednesday,Poster 2.1,151,Junhwa Hur,Junhwa Hur and Stefan Roth\r\n99,Strong-Weak Distribution Alignment for Adaptive Object Detection,https://github.com/VisionLearningGroup/DA_Detection,,,,,,,Wednesday,Poster 2.2,3,Kuniaki Saito,\"Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, Kate Saenko\"\r\n22,ATOM: Accurate Tracking by Overlap Maximization,https://github.com/visionml/pytracking,,,,Wednesday,Oral 2.1C,9:24,Wednesday,Poster 2.1,144,Martin Danelljan,\"Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, Michael Felsberg\"\r\n458,MOTS: Multi-Object Tracking and Segmentation,https://github.com/VisualComputingInstitute/mots_tools,,,,,,,Wednesday,Poster 2.2,122,Paul Voigtlaender,\r\n45,PifPaf: Composite Fields for Human Pose Estimation,https://github.com/vita-epfl/openpifpaf,,,,,,,Thursday,Poster 3.2,120,Sven Kreiss,\"Sven Kreiss, Lorenzo Bertoni, Alexandre Alahi\"\r\n459,Mixed Effects Neural Networks (MeNets) With Applications to Gaze Estimation,https://github.com/vsingh-group/MeNets,,,,,,,Wednesday,Poster 2.2,102,Yunyang Xiong,\r\n175,Curls & Whey: Boosting Black-Box Adversarial Attacks,https://github.com/walegahaha/Curls-Whey,,,,Wednesday,Oral 
2.2A,14:42,Wednesday,Poster 2.2,37,Yucheng Shi,\"Yucheng Shi, Siyu Wang, Yahong Han\"\r\n115,Adaptively Connected Neural Networks,https://github.com/wanggrun/Adaptively-Connected-Neural-Networks,,,,,,,Tuesday,Poster 1.1,180,Guangrun Wang,\"Guangrun Wang, Keze Wang, Liang Lin\"\r\n513,Shape2Motion: Joint Analysis of Motion Parts and Attributes From 3D Shapes,https://github.com/wangxiaogang866/Shape2Motion,,,,Thursday,Oral 3.1C,9:06,Thursday,Poster 3.1,61,Xiaogang Wang,\r\n460,Interactive Image Segmentation via Backpropagating Refinement Scheme,https://github.com/wdjang/BRS-Interactive_segmentation,,,,,,,Wednesday,Poster 2.1,75,Won-Dong Jang,Won-Dong Jang and Chang-Su Kim\r\n250,Iterative Reorganization With Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning,https://github.com/weichen582/Unsupervised-Visual-Recognition-by-Solving-Arbitrary-Puzzles,,,,,,,Tuesday,Poster 1.1,193,Chen Wei,\"Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, Alan L. 
Yuille \"\r\n221,Conditional Single-View Shape Generation for Multi-View Stereo Reconstruction,https://github.com/weiyithu/OptimizeMVS,,,,,,,Thursday,Poster 3.1,102,Yi Wei,\"Yi Wei, Shaohui Liu, Wang Zhao, Jiwen Lu, Jie Zhou\"\r\n461,Revisiting Local Descriptor Based Image-To-Class Measure for Few-Shot Learning,https://github.com/WenbinLee/DN4.git,,,,,,,Wednesday,Poster 2.2,52,Wenbin Li,\"Wenbin Li, Lei Wang, Jinglin Xu, Jing Huo, Yang Gao, Jiebo Luo\"\r\n777,Learning Unsupervised Video Object Segmentation Through Visual Attention,https://github.com/wenguanwang/AGS,,,,,,,Tuesday,Poster 1.2,59,Wenguan Wang,\r\n462,Adapting Object Detectors via Selective Cross-Domain Alignment,https://github.com/WERush/SCDA,,,,,,,Tuesday,Poster 1.1,46,Xinge Zhu,\"Zhu, Xinge and Pang, Jiangmiao and Yang, Ceyuan and Shi, Jianping and Lin, Dahua\"\r\n24,Shape Robust Text Detection With Progressive Scale Expansion Network,https://github.com/whai362/PSENet,,,,,,,Thursday,Poster 3.1,40,Wenhai Wang,\"Xiang Li, Wenhai Wang, Wenbo Hou, Ruo-Ze Liu, Tong Lu, Jian Yang\"\r\n498,AE2-Nets: Autoencoder in Autoencoder Networks,https://github.com/willow617/AE2-Nets,,,,Tuesday,Oral 1.2C,14:16,Tuesday,Poster 1.2,174,Changqing Zhang,Changqing Zhang Yeqing Liu Huazhu Fu\r\n229,C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection,https://github.com/Winfrand/C-MIL,,,,Tuesday,Oral 1.2A,14:06,Tuesday,Poster 1.2,24,Fang Wan,\"Fang Wan, Chang Liu, Wei Ke, Xiangyang Ji, Jianbin Jiao, Qixiang Ye\"\r\n204,Learning Actor Relation Graphs for Group Activity Recognition,https://github.com/wjchaoGit/Group-Activity-Recognition,,,,,,,Thursday,Poster 3.1,134,Jianchao Wu,\"Jianchao Wu, Limin Wang, Li Wang, Jie Guo, Gangshan Wu\"\r\n463,Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations,https://github.com/wonheeML/mtl-ssl,,,,,,,Wednesday,Poster 2.1,43,Wonhee Lee,\r\n149,Cascaded Partial Decoder for Fast and Accurate Salient Object 
Detection,https://github.com/wuzhe71/CPD,,,,,,,Tuesday,Poster 1.2,162,Zhe Wu,\"Zhe Wu, Li Su, Qingming Huang\"\r\n65,Associatively Segmenting Instances and Semantics in Point Clouds,https://github.com/WXinlong/ASIS,,,,,,,Tuesday,Poster 1.2,199,Xinlong Wang,\"Xinlong Wang, Shu Liu, Xiaoyong Shen, Chunhua Shen, Jiaya Jia\"\r\n77,Hardness-Aware Deep Metric Learning,https://github.com/wzzheng/HDML,,,,Tuesday,Oral 1.1A,9:41,Tuesday,Poster 1.1,8,Wenzhao Zheng,\"Wenzhao Zheng, Zhaodong Chen, Jiwen Lu, Jie Zhou\"\r\n67,Data Augmentation Using Learned Transformations for One-Shot Medical Image Segmentation,https://github.com/xamyzhao/brainstorm,,,,Thursday,Oral 3.1A,8:40,Thursday,Poster 3.1,192,Amy Zhao,\"Amy Zhao, Guha Balakrishnan, Frédo Durand, John V. Guttag, Adrian V. Dalca\"\r\n464,Synthesizing Environment-Aware Activities via Activity Sketches,https://github.com/xavierpuigf/virtualhome,,,,,,,Wednesday,Poster 2.1,206,Yuan-Hong Liao,\"A. Liao*, X. Puig*, M. Boben, A. Torralba, S. Fidler\"\r\n465,Generating 3D Adversarial Point Clouds,https://github.com/xiangchong1/3d-adv-pc,,,,,,,Thursday,Poster 3.1,19,Chong Xiang,\"Chong Xiang, Charles R. 
Qi, Bo Li\"\r\n142,Learning Transformation Synchronization,https://github.com/xiangruhuang/Learning2Sync,,,,,,,Wednesday,Poster 2.2,154,Xiangru Huang,\"Xiangru Huang, Zhenxiao Liang, Xiaowei Zhou, Yao Xie, Leonidas Guibas, and Qixing Huang\"\r\n466,Self-Supervised Learning via Conditional Motion Propagation,https://github.com/XiaohangZhan/conditional-motion-propagation?,,,,,,,Tuesday,Poster 1.1,190,Xiaohang Zhan,\"Xiaohang Zhan, Xingang Pan, Ziwei Liu, Dahua Lin, Chen Change Loy\"\r\n72,AOGNets: Compositional Grammatical Architectures for Deep Learning,https://github.com/xilaili/AOGNet,,,,,,,Wednesday,Poster 2.1,199,Xilai Li,\"Xilai Li, Xi Song, Tianfu Wu\"\r\n18,Bottom-Up Object Detection by Grouping Extreme and Center Points,https://github.com/xingyizhou/ExtremeNet,,,,,,,Tuesday,Poster 1.1,62,Xingyi Zhou,\"Xingyi Zhou, Jiacheng Zhuo, Philipp Krähenbühl\"\r\n467,Learning Video Representations From Correspondence Proposals,https://github.com/xingyul/cpnet,,,,Wednesday,Oral 2.1A,8:30,Wednesday,Poster 2.1,1,Xingyu Liu,\"Xingyu Liu, Joon-Young Lee and Hailin Jin\"\r\n468,FlowNet3D: Learning Scene Flow in 3D Point Clouds,https://github.com/xingyul/flownet3d,,,,,,,Tuesday,Poster 1.1,30,Xingyu Liu,\"Xingyu Liu, Charles R. Qi, Leonidas J. 
Guibas\"\r\n469,Target-Aware Deep Tracking,https://github.com/XinLi-zn/TADT,,https://xinli-zn.github.io/TADT-project-page/,,,,,Tuesday,Poster 1.1,138,Xin Li,\"Xin Li, Chao Ma, Baoyuan Wu, Zhenyu He, Ming-Hsuan Yang\"\r\n1242,Deep Network Interpolation for Continuous Imagery Effect Transition,https://github.com/xinntao/DNI,,,,,,,Tuesday,Poster 1.1,171,Xintao Wang,\r\n470,Ranked List Loss for Deep Metric Learning,https://github.com/XinshaoAmosWang/Deep-Metric-Embedding,,,,,,,Wednesday,Poster 2.1,66,Xinshao Wang,\r\n267,PCAN: 3D Attention Map Learning Using Contextual Information for Point Cloud Based Retrieval,https://github.com/XLechter/PCAN,,,,,,,Thursday,Poster 3.2,185,Wenxiao Zhang,\"Wenxiao Zhang, Chunxia Xiao\"\r\n277,IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition,https://github.com/xpwu95/IP102,,,,Thursday,Oral 3.1B,9:42,Thursday,Poster 3.1,89,Xiaoping Wu,\"Xiaoping Wu, Chi Zhan, Yu-Kun Lai, Ming-Ming Cheng, Jufeng Yang\"\r\n169,\"Rob-GAN: Generator, Discriminator, and Adversarial Attacker\",https://github.com/xuanqing94/RobGAN,,,,,,,Thursday,Poster 3.2,26,Xuanqing Liu,\"Xuanqing Liu, Cho-Jui Hsieh\"\r\n471,Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction,https://github.com/xudejing/VCOP,,,,,,,Thursday,Poster 3.1,172,Dejing Xu,\"Dejing Xu, Jun Xiao, Zhou Zhao, Jian Shao, Di Xie, Yueting Zhuang\"\r\n472,Meta-SR: A Magnification-Arbitrary Network for Super-Resolution,https://github.com/XuecaiHu/Meta-SR-Pytorch,,,,,,,Tuesday,Poster 1.1,159,Xuecai Hu,\"Xuecai Hu, Haoyuan Mu, Xiangyu Zhang, Zilei Wang, Tieniu Tan, Jian Sun\"\r\n473,Depth-Attentional Features for Single-Image Rain Removal,https://github.com/xw-hu/DAF-Net,,,,,,,Wednesday,Poster 2.2,148,Xiaowei Hu,\"Xiaowei Hu, Chi-Wing Fu, Lei Zhu, and Pheng-Ann Heng\"\r\n102,Group-Wise Correlation Stereo Network,https://github.com/xy-guo/GwcNet,,,,,,,Tuesday,Poster 1.2,80,Xiaoyang Guo,\"Xiaoyang Guo, Kai Yang, Wukui Yang, Xiaogang Wang, Hongsheng Li\"\r\n133,Meta-Transfer Learning for Few-Shot 
Learning,https://github.com/y2l/meta-transfer-learning-tensorflow,,,,,,,Tuesday,Poster 1.1,17,Qianru Sun,\"Qianru Sun, Yaoyao Liu, Tat-Seng Chua, Bernt Schiele\"\r\n199,MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation,https://github.com/yabufarha/ms-tcn,,,,,,,Tuesday,Poster 1.2,110,Yazan Abu Farha,\"Yazan Abu Farha, Juergen Gall\"\r\n474,Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval,https://github.com/yalesong/pvse,,,,,,,Tuesday,Poster 1.1,200,Yale Song,\r\n519,Auto-Encoding Scene Graphs for Image Captioning,https://github.com/yangxuntu/SGAE,,,,Thursday,Oral 3.2A,14:06,Thursday,Poster 3.2,7,Xu Yang,\r\n292,On Finding Gray Pixels,https://github.com/yanlinqian/Grayness-Index,,,,,,,Wednesday,Poster 2.2,152,Yanlin Qian,\"Yanlin Qian, Joni-Kristian Kämäräinen, Jarno Nikkanen, Jiri Matas\"\r\n124,Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph,https://github.com/yaohungt/GSTEG_CVPR_2019,,,,,,,Thursday,Poster 3.1,181,Yao-Hung Hubert Tsai,\"Yao-Hung Hubert Tsai, Santosh Divvala, Louis-Philippe Morency, Ruslan Salakhutdinov, Ali Farhadi\"\r\n200,What Object Should I Use? 
- Task Driven Object Detection,https://github.com/yassersouri/task-driven-object-detection,,,,,,,Wednesday,Poster 2.2,88,Johann Sawatzky,\"Johann Sawatzky, Yaser Souri, Christian Grund, Juergen Gall\"\r\n475,InverseRenderNet: Learning Single Image Inverse Rendering,https://github.com/YeeU/InverseRenderNet,,,,,,,Tuesday,Poster 1.2,68,Ye Yu,\"Yu, Ye and Smith, William AP\"\r\n143,Patch-Based Progressive 3D Point Set Upsampling,https://github.com/yifita/3PU,,,,,,,Wednesday,Poster 2.1,172,Wang Yifan,\"Wang Yifan, Shihao Wu, Hui Huang, Daniel Cohen-Or, Olga Sorkine-Hornung\"\r\n32,Bounding Box Regression With Uncertainty for Accurate Object Detection,https://github.com/yihui-he/KL-Loss,https://mp.weixin.qq.com/s/1q_DZ_-WswjAsS1qjxvLNg?,,,,,,Tuesday,Poster 1.2,41,Yihui He,\"Yihui He, Chenchen Zhu, Jianren Wang, Marios Savvides, Xiangyu Zhang\"\r\n476,Im2Pencil: Controllable Pencil Illustration From Photographs,https://github.com/Yijunmaverick/Im2Pencil?,,,,,,,Tuesday,Poster 1.1,154,Yijun Li,\"Yijun Li,Chen Fang, Aaron Hertzmann, Eli Shechtman, Ming-Hsuan Yang\"\r\n477,Probabilistic End-To-End Noise Correction for Learning With Noisy Labels,https://github.com/yikun2019/PENCIL,,,,,,,Wednesday,Poster 2.2,9,Kun Yi,\r\n478,Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation,https://github.com/yingcong/HomoInterpGAN,,,,Tuesday,Oral 1.2B,14:24,Tuesday,Poster 1.2,127,Ying-Cong Chen,\r\n303,Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes,https://github.com/yiranzhong/EPIflow.git,,,,,,,Thursday,Poster 3.2,132,Yiran Zhong,\"Yiran Zhong, Pan Ji, Jianyuan Wang, Yuchao Dai, Hongdong Li\"\r\n479,StoryGAN: A Sequential Conditional GAN for Story Visualization,https://github.com/yitong91/StoryGAN,,,,,,,Wednesday,Poster 2.1,210,Yitong Li,\"Yitong Li, Zhe Gan, Yelong Shen, Jingjing Liu, Yu Cheng, Yuexin Wu, Lawrence Carin, David Carlson, Jianfeng Gao\"\r\n480,Learning to Cluster Faces on an Affinity 
Graph,https://github.com/yl-1993/learn-to-cluster?,,,,Tuesday,Oral 1.2A,15:05,Tuesday,Poster 1.2,34,Lei Yang,\"Lei Yang, Xiaohang Zhan, Dapeng Chen, Junjie Yan, Chen Change Loy, Dahua Lin\"\r\n95,Relation-Shape Convolutional Neural Network for Point Cloud Analysis,https://github.com/Yochengliu/Relation-Shape-CNN,https://mp.weixin.qq.com/s/KuoHFPpUYHPFacsBWZWVSg,https://yochengliu.github.io/Relation-Shape-CNN/,,Thursday,Oral 3.1C,9:16,Thursday,Poster 3.1,63,Yongcheng Liu,\"Yongcheng Liu, Bin Fan, Shiming Xiang, Chunhong Pan\"\r\n178,3D Point Capsule Networks,https://github.com/yongheng1991/3D-point-capsule-networks,,,,,,,Tuesday,Poster 1.1,90,Yongheng Zhao,\"Yongheng Zhao, Tolga Birdal, Haowen Deng, Federico Tombari \"\r\n29,Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference,https://github.com/YoYo000/MVSNet,,,,,,,Wednesday,Poster 2.1,98,Yao Yao,\"Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan\"\r\n481,Side Window Filtering,https://github.com/YuanhaoGong/SideWindowFilter ,,,,Thursday,Oral 3.1B,9:24,Thursday,Poster 3.1,86,Hui Yin,\"Hui Yin, Yuanhao Gong, Guoping Qiu\"\r\n86,Fast Human Pose Estimation,https://github.com/yuanyuanli85/Fast_Human_Pose_Estimation_Pytorch,,,,,,,Tuesday,Poster 1.2,104,Feng Zhang,\"Feng Zhang, Xiatian Zhu, Mao Ye\"\r\n259,Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression,https://github.com/yuchaoli/KSE,,,,,,,Tuesday,Poster 1.2,14,Yuchao Li,\"Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji\"\r\n482,Learning Semantic Segmentation From Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach,https://github.com/yuhuayc/gio-ada,,,,,,,Tuesday,Poster 1.1,186,Yuhua Chen,\r\n268,RENAS: Reinforced Evolutionary Neural Architecture Search,https://github.com/yukang2017/RENAS,,,,,,,Wednesday,Poster 2.1,23,Yukang Chen,\"Yukang Chen, Gaofeng Meng, Qian Zhang, Shiming Xiang, Chang Huang, Lisen Mu, Xinggang 
Wang\"\r\n260,DeepFlux for Skeletons in the Wild,https://github.com/YukangWang/DeepFlux,,,,,,,Wednesday,Poster 2.1,74,Yukang Wang,\"Yukang Wang, Yongchao Xu, Stavros Tsogkas, Xiang Bai, Sven Dickinson, Kaleem Siddiqi\"\r\n483,Recursive Visual Attention in Visual Dialog,https://github.com/yuleiniu/rva,,,,Wednesday,Oral 2.2B,14:29,Wednesday,Poster 2.2,187,Yulei Niu,\"Yulei Niu, Hanwang Zhang, Manli Zhang, Jianhong Zhang, Zhiwu Lu, Ji-Rong Wen\"\r\n97,Tightness-Aware Evaluation Protocol for Scene Text Detection,https://github.com/Yuliang-Liu/TIoU-metric,,,,,,,Thursday,Poster 3.1,98,Yuliang Liu,\"Yuliang Liu, Lianwen Jin, Zecheng Xie, Canjie Luo, Shuaitao Zhang, Lele Xie \"\r\n158,Learning Metrics From Teachers: Compact Networks for Image Embedding,https://github.com/yulu0724/EmbeddingDistillation,,,,,,,Tuesday,Poster 1.2,43,Lu Yu,\"Lu Yu, Vacit Oguz Yazici, Xialei Liu, Joost van de Weijer, Yongmei Cheng, Arnau Ramisa\"\r\n210,Memory in Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity From Spatiotemporal Dynamics,https://github.com/Yunbo426/MIM,,,,,,,Thursday,Poster 3.1,21,Yunbo Wang,\"Yunbo Wang, Jianjin Zhang, Hongyu Zhu, Mingsheng Long, Jianmin Wang, Philip S Yu\"\r\n484,Connecting Touch and Vision via Cross-Modal Prediction,https://github.com/YunzhuLi/VisGel,,http://visgel.csail.mit.edu/,,,,,Thursday,Poster 3.1,215,Yunzhu Li,\"Yunzhu Li,Jun-Yan Zhu,Russ Tedrake,and Antonio Torralba\"\r\n171,Knowledge-Embedded Routing Network for Scene Graph Generation,https://github.com/yuweihao/KERN,,,,,,,Wednesday,Poster 2.1,193,Tianshui Chen,\"Tianshui Chen, Weihao Yu, Riquan Chen, Liang Lin \"\r\n183,Fast Single Image Reflection Suppression via Convex Optimization,https://github.com/yyhz76/reflectSuppress,,,,,,,Wednesday,Poster 2.2,160,Yang Yang,\"Yang Yang, Wenye Ma, Yin Zheng, Jian-Feng Cai, Weiyu Xu \"\r\n485,Path-Invariant Map Networks,https://github.com/zaiweizhang/path_invariance_map_network,,,,Thursday,Oral 3.2C,14:29,Thursday,Poster 
3.2,157,Zaiwei Zhang,\"Zaiwei Zhang, Zhenxiao Liang, Lemeng Wu, Xiaowei Zhou and Qixing Huang\"\r\n238,Multi-Source Weak Supervision for Saliency Detection,https://github.com/zengxianyu/mws,,,,,,,Wednesday,Poster 2.1,184,Yu Zeng,\"Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang, Mingyang Qian, Yizhou Yu \"\r\n760,Triangulation Learning Network: From Monocular to Stereo 3D Object Detection,https://github.com/Zengyi-Qin/TLNet,,,,,,,Wednesday,Poster 2.2,89,Zengyi Qin,\r\n193,A Bayesian Perspective on the Deep Image Prior,https://github.com/ZezhouCheng/GP-DIP,,,,,,,Wednesday,Poster 2.1,90,Zezhou Cheng,\"Zezhou Cheng, Matheus Gadelha, Subhransu Maji, Daniel Sheldon\"\r\n486,CRAVES: Controlling Robotic Arm With a Vision-Based Economic System,https://github.com/zfw1226/craves_control,,,,,,,Tuesday,Poster 1.2,211,Yiming Zuo,\"Zuo, Yiming and Qiu, Weichao and Xie, Lingxi and Zhong, Fangwei and Wang, Yizhou and Yuille, Alan L\"\r\n136,The Perfect Match: 3D Point Cloud Matching With Smoothed Densities,https://github.com/zgojcic/3DSmoothNet,,,,,,,Wednesday,Poster 2.1,100,Zan Gojcic,\"Zan Gojcic, Caifa Zhou, Jan D. 
Wegner, Andreas Wieser\"\r\n487,CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection,https://github.com/zhangludl/code-and-dataset-for-CapSal?,,,,,,,Wednesday,Poster 2.1,179,Lu Zhang,Lu Zhang; Huchuan Lu ; Zhe Lin ; Jianming Zhang; You He\r\n488,Image Generation From Layout,https://github.com/zhaobozb/layout2im,,,,Thursday,Oral 3.1A,9:06,Thursday,Poster 3.1,196,Bo Zhao,\"Bo Zhao, Lili Meng,Weidong Yin, Leonid Sigal\"\r\n489,Learning Spatio-Temporal Representation With Local and Global Diffusion,https://github.com/ZhaofanQiu/local-and-global-diffusion-networks,,,,,,,Thursday,Poster 3.2,128,Zhaofan Qiu,\r\n230,Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers,https://github.com/zhen-he/tracking-by-animation,,,,,,,Tuesday,Poster 1.1,133,Zhen He,\"Zhen He, Jian Li, Daxue Liu, Hangen He, David Barber\"\r\n278,Activity Driven Weakly Supervised Object Detection,https://github.com/zhenheny/ADWSOD,,,,,,,Tuesday,Poster 1.2,44,Zhenheng Yang,\"Zhenheng Yang, Dhruv Mahajan, Deepti Ghadiyaram, Ram Nevatia, Vignesh Ramanathan\"\r\n490,Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion,https://github.com/zhenpeiyang/RelativePose,,,,Wednesday,Oral 2.1B,9:34,Wednesday,Poster 2.1,117,Zhenpei Yang,\"Zhenpei Yang, Jeffrey Z.Pan, Linjie Luo, Xiaowei Zhou, Kristen Grauman and Qixing Huang\"\r\n491,Disentangling Latent Space for VAE by Label Relevant/Irrelevant Dimensions,https://github.com/ZhilZheng/Lr-LiVAE,,,,,,,Thursday,Poster 3.2,142,Zhilin Zheng,\"Zhilin Zheng, Li Sun\"\r\n492,Large-Scale Long-Tailed Recognition in an Open World,https://github.com/zhmiao/OpenLongTailRecognition-OLTR,,https://liuziwei7.github.io/projects/LongTail.html,,Tuesday,Oral 1.2C,13:53,Tuesday,Poster 1.2,170,Ziwei Liu,\"Ziwei Liu*, Zhongqi Miao*, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. 
Yu\"\r\n116,Linkage Based Face Clustering via Graph Convolution Network,https://github.com/Zhongdao/gcn_clustering,,,,,,,Tuesday,Poster 1.1,101,Zhongdao Wang,\"Zhongdao Wang, Liang Zheng, Yali Li, Shengjin Wang\"\r\n493,Unequal-Training for Deep Face Recognition With Long-Tailed Noisy Data,https://github.com/zhongyy/Unequal-Training-for-Deep-Face-Recognition-with-Long-Tailed-Noisy-Data,,,,,,,Wednesday,Poster 2.2,109,Yaoyao Zhong,\"Yaoyao Zhong, Weihong Deng, Mei Wang, Jiani Hu, Jianteng Peng, Xunqiang Tao, Yaohai Huang\"\r\n66,Detailed Human Shape Estimation From a Single Image by Hierarchical Mesh Deformation,https://github.com/zhuhao-nju/hmd,,,,Wednesday,Oral 2.1B,9:11,Wednesday,Poster 2.1,113,Hao Zhu,\"Hao Zhu, Xinxin Zuo, Sen Wang, Xun Cao, Ruigang Yang\"\r\n61,Invariance Matters: Exemplar Memory for Domain Adaptive Person Re-Identification,https://github.com/zhunzhong07/ECN,,,,,,,Tuesday,Poster 1.1,37,Zhun Zhong,\"Zhun Zhong, Liang Zheng, Zhiming Luo, Shaozi Li, Yi Yang\"\r\n107,Disentangled Representation Learning for 3D Face Shape,https://github.com/zihangJiang/DR-Learning-for-3D-Face,,,,,,,Thursday,Poster 3.2,118,Zi-Hang Jiang,\"Zi-Hang Jiang, Qianyi Wu, Keyu Chen, Juyong Zhang\"\r\n177,Reasoning Visual Dialogs With Structural and Partial Observations,https://github.com/zilongzheng/visdial-gnn,,,,Wednesday,Oral 2.2B,14:24,Wednesday,Poster 2.2,186,Zilong Zheng,\"Zilong Zheng, Wenguan Wang, Siyuan Qi, Song-Chun Zhu\"\r\n372,Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes,https://github.com/ziquan111/RobustPCLReconstruction,,,,,,,Thursday,Poster 3.1,106,Ziquan Lan,\"Ziquan Lan, Zi Jian Yew, Gim Hee Lee\"\r\n12,Mask Scoring R-CNN,https://github.com/zjhuang22/maskscoring_rcnn,https://mp.weixin.qq.com/s/aP7O7AF6WoynWK_FFHkOTw,,,Wednesday,Oral 2.2A,13:35,Wednesday,Poster 2.2,26,Zhaojin Huang,\"Zhaojin Huang, Lichao Huang, Yongchao Gong, Chang Huang, Xinggang Wang\"\r\n494,Fast and Robust Multi-Person 3D Pose Estimation From Multiple 
Views,https://github.com/zju-3dv/mvpose,,,,,,,Wednesday,Poster 2.2,107,Junting Dong,\"Junting Dong, Wen Jiang, Qixing Huang, Hujun Bao, Xiaowei Zhou\"\r\n42,PVNet: Pixel-Wise Voting Network for 6DoF Pose Estimation,https://github.com/zju3dv/pvnet,,,,Wednesday,Oral 2.1B,9:52,Wednesday,Poster 2.1,120,Sida Peng,\"Sida Peng, Yuan Liu, Qixing Huang, Hujun Bao, and Xiaowei Zhou\"\r\n51,Image Super-Resolution by Neural Texture Transfer,https://github.com/ZZUTK/SRNTT,,http://web.eecs.utk.edu/~zzhang61/project_page/SRNTT/SRNTT.html,,,,,Wednesday,Poster 2.2,126,Zhifei Zhang,\"Zhifei Zhang, Zhaowen Wang, Zhe Lin, Hairong Qi\"\r\n979,TransGaGa: Geometry-Aware Unsupervised Image-To-Image Translation,https://wywu.github.io/projects/TGaGa/TGaGa.html,,,,,,,Wednesday,Poster 2.2,129,Wayne Wu,\r\n495,Parallel Optimal Transport GAN,,,,,Wednesday,Oral 2.1A,9:52,Wednesday,Poster 2.1,15,Gil Avraham,\r\n496,Learning to Explain With Complemental Examples,,,,,Thursday,Oral 3.1A,9:16,Thursday,Poster 3.1,198,Atsushi Kanehira,\"Atsushi Kanehira, Tatsuya Harada\"\r\n497,Textured Neural Avatars,,,,,Tuesday,Oral 1.2B,14:11,Tuesday,Poster 1.2,125,Aliaksandra Shysheya,\"Aliaksandra Shysheya (Samsung AI Center, Skolkovo Institute of Science and Technology), Egor Zakharov (Samsung AI Center, Skolkovo Institute of Science and Technology), Kara-Ali Aliev (Samsung AI Center), Renat Bashirov (Samsung AI Center), Egor Burkov (Samsung AI Center, Skolkovo Institute of Science and Technology), Karim Iskakov (Samsung AI Center), Aleksei Ivakhnenko (Samsung AI Center), Yury Malkov (Samsung AI Center), Igor Pasechnik (Samsung AI Center), Dmitry Ulyanov (Samsung AI Center, Skolkovo Institute of Science and Technology), Alexander Vakhitov (Samsung AI Center, Skolkovo Institute of Science and Technology), Victor Lempitsky (Samsung AI Center, Skolkovo Institute of Science and Technology)\"\r\n499,Self-Supervised Adaptation of High-Fidelity Face Models for Monocular Performance Tracking,,,,,Wednesday,Oral 
2.1C,8:48,Wednesday,Poster 2.1,138,Jae Shin Yoon,\r\n501,IM-Net for High Resolution Video Frame Interpolation,,,,,Tuesday,Oral 1.2B,14:16,Tuesday,Poster 1.2,126,Tomer Peleg,\r\n503,Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in a Triadic Interaction,,,,,Thursday,Oral 3.2B,14:11,Thursday,Poster 3.2,99,Hanbyul Joo,\r\n505,Cycle-Consistency for Robust Visual Question Answering,,,,,Wednesday,Oral 2.2B,14:11,Wednesday,Poster 2.2,184,Meet Shah,\r\n506,Efficient Video Classification Using Fewer Frames,,,,,Tuesday,Oral 1.1C,10:04,Tuesday,Poster 1.1,120,Shweta Bhardwaj,\r\n507,Diverse Generation for Multi-Agent Sports Games,,,,,Wednesday,Oral 2.1C,8:53,Wednesday,Poster 2.1,139,Raymond A. Yeh,\r\n508,Relational Action Forecasting,,,,,Tuesday,Oral 1.1C,9:18,Tuesday,Poster 1.1,112,Chen Sun,\r\n509,Barrage of Random Transforms for Adversarially Robust Defense,,,,,Wednesday,Oral 2.2A,14:47,Wednesday,Poster 2.2,38,Edward Raff,\r\n511,Unified Visual-Semantic Embeddings: Bridging Vision and Language With Structured Meaning Representations,,,,,Wednesday,Oral 2.2B,13:48,Wednesday,Poster 2.2,180,Hao Wu,\r\n514,GFrames: Gradient-Based Local Reference Frame for 3D Shape Matching,,,,,Wednesday,Oral 2.1C,9:06,Wednesday,Poster 2.1,141,Simone Melzi,\r\n515,Learning Structure-And-Motion-Aware Rolling Shutter Correction,,,,,Wednesday,Oral 2.1B,9:47,Wednesday,Poster 2.1,119,Bingbing Zhuang,\r\n516,Do Better ImageNet Models Transfer Better?,,,,,Tuesday,Oral 1.2C,15:10,Tuesday,Poster 1.2,183,Simon Kornblith,\r\n518,Streamlined Dense Video Captioning,,,,,Wednesday,Oral 2.2B,13:35,Wednesday,Poster 2.2,178,Jonghwan Mun,\r\n520,Enhancing Diversity of Defocus Blur Detectors via Cross-Ensemble Network,,,,,Thursday,Oral 3.1C,9:24,Thursday,Poster 3.1,64,Wenda Zhao,\r\n521,Neural Illumination: Lighting Prediction for Indoor Environments,,,,,Wednesday,Oral 2.2C,15:05,Wednesday,Poster 2.2,146,Shuran Song,\r\n522,Shapes and Context: In-The-Wild Image Synthesis & 
Manipulation,,,,,Tuesday,Oral 1.2B,13:30,Tuesday,Poster 1.2,118,Aayush Bansal,\r\n523,\"Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech\",,,,,Thursday,Oral 3.2A,14:11,Thursday,Poster 3.2,8,Aditya Deshpande,\r\n524,\"Divergence Triangle for Joint Training of Generator Model, Energy-Based Model, and Inferential Model\",,,,,Thursday,Oral 3.1B,8:30,Thursday,Poster 3.1,77,Tian Han,\r\n525,Learning to Synthesize Motion Blur,,,http://timothybrooks.com/tech/motion-blur/,,Wednesday,Oral 2.2C,14:16,Wednesday,Poster 2.2,138,Tim Brooks,\"Tim Brooks, Jonathan T. Barron\"\r\n527,Defense Against Adversarial Images Using Web-Scale Nearest-Neighbor Search,,,,,Thursday,Oral 3.1B,9:29,Thursday,Poster 3.1,87,Abhimanyu Dubey,\r\n528,Gaussian Temporal Awareness Networks for Action Localization,,,,,Tuesday,Oral 1.1C,9:59,Tuesday,Poster 1.1,119,Fuchen Long,\r\n529,Photon-Flooded Single-Photon 3D Cameras,,,,,Wednesday,Oral 2.2C,13:30,Wednesday,Poster 2.2,130,Anant Gupta,\r\n530,NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences,,,,,Tuesday,Oral 1.1B,9:54,Tuesday,Poster 1.1,81,Chen Zhao,\r\n531,End-To-End Interpretable Neural Motion Planner,,,,,Thursday,Oral 3.1A,9:52,Thursday,Poster 3.1,204,Wenyuan Zeng,\r\n532,Learning Spatial Common Sense With Geometry-Aware Recurrent Networks,,,,,Tuesday,Oral 1.2C,14:29,Tuesday,Poster 1.2,176,Hsiao-Yu Fish Tung,\r\n533,Deep Reinforcement Learning of Volume-Guided Progressive View Inpainting for 3D Point Scene Completion From a Single Depth Image,,,,,Tuesday,Oral 1.1B,10:04,Tuesday,Poster 1.1,83,Xiaoguang Han,\r\n534,H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions,,,,,Wednesday,Oral 2.1B,9:24,Wednesday,Poster 2.1,115,Bugra Tekin,\r\n535,d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding,,,,,Tuesday,Oral 1.2C,13:30,Tuesday,Poster 1.2,166,Xiang Xu,\r\n536,A Sufficient Condition for Convergences of Adam and RMSProp,,,,,Thursday,Oral 3.2C,14:52,Thursday,Poster 3.2,161,Fangyu 
Zou,\r\n537,Gait Recognition via Disentangled Representation Learning,,,,,Wednesday,Oral 2.1C,9:52,Wednesday,Poster 2.1,149,Ziyuan Zhang,\r\n538,Constrained Generative Adversarial Networks for Interactive Image Generation,,,,,Thursday,Oral 3.2A,14:47,Thursday,Poster 3.2,14,Eric Heim,\r\n539,Unprocessing Images for Learned Raw Denoising,,,http://timothybrooks.com/tech/unprocessing/,,Thursday,Oral 3.2C,13:58,Thursday,Poster 3.2,152,Tim Brooks,\"Tim Brooks, Ben Mildenhall, Tianfan Xue, Jiawen Chen, Dillon Sharlet, Jonathan T. Barron\"\r\n540,Efficient Online Multi-Person 2D Pose Tracking With Recurrent Spatio-Temporal Affinity Fields,,,,,Wednesday,Oral 2.1C,8:58,Wednesday,Poster 2.1,140,Yaadhav Raaj,\r\n541,Cross-Modality Personalization for Retrieval,,,,,Wednesday,Oral 2.2A,13:48,Wednesday,Poster 2.2,28,Nils Murrugarra-Llerena,\r\n542,FML: Face Model Learning From Videos,,https://mp.weixin.qq.com/s/wxZXVtzgxzBKqWOyt9Godw,http://gvv.mpi-inf.mpg.de/projects/FML19/,,Thursday,Oral 3.2B,13:35,Thursday,Poster 3.2,93,Ayush Tewari,A. Tewari F. Bernard P. Garrido G. Bharaj M. Elgharib H-P. Seidel P. Perez M. 
Zollhöfer C. Theobalt\r\n543,Learning Joint Gait Representation via Quintuplet Loss Minimization,,,,,Wednesday,Oral 2.1C,9:47,Wednesday,Poster 2.1,148,Kaihao Zhang,\r\n544,Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation,,,,,Thursday,Oral 3.2B,14:24,Thursday,Poster 3.2,101,Xipeng Chen,\r\n545,DeepVoxels: Learning Persistent 3D Feature Embeddings,,,,,Tuesday,Oral 1.2B,14:42,Tuesday,Poster 1.2,130,Vincent Sitzmann,\r\n546,Mutual Learning of Complementary Networks via Residual Correction for Improving Semi-Supervised Classification,,,,,Wednesday,Oral 2.2A,14:29,Wednesday,Poster 2.2,35,Si Wu,\r\n547,Causes and Corrections for Bimodal Multi-Path Scanning With Structured Light,,,,,Wednesday,Oral 2.1B,8:35,Wednesday,Poster 2.1,107,Yu Zhang,\r\n548,Neural RGB→D Sensing: Depth and Uncertainty From a Video Camera,,,,,Thursday,Oral 3.2C,13:30,Thursday,Poster 3.2,147,Chao Liu,\r\n549,Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation,,,,,Tuesday,Oral 1.2C,14:47,Tuesday,Poster 1.2,179,Po-Yi Chen,\r\n550,Slim DensePose: Thrifty Learning From Sparse Annotations and Motion Cues,,,,,Thursday,Oral 3.2B,14:34,Thursday,Poster 3.2,103,Natalia Neverova,\r\n551,Inverse Procedural Modeling of Knitwear,,,,,Thursday,Oral 3.1A,9:34,Thursday,Poster 3.1,201,Elena Trunz,\r\n552,Online High Rank Matrix Completion,,,,,Thursday,Oral 3.1B,8:40,Thursday,Poster 3.1,79,Jicong Fan,\r\n553,Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation,,https://mp.weixin.qq.com/s/LsHWkdwqqrOPFgCNNcBdpg,,,Wednesday,Oral 2.2B,13:58,Wednesday,Poster 2.2,182,Xin Wang,\"Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Yang Wang, Lei Zhang\"\r\n554,SDC - Stacked Dilated Convolution: A Unified Descriptor Network for Dense Matching Tasks,,,,,Tuesday,Oral 1.2C,14:06,Tuesday,Poster 1.2,172,Rene Schuster,\r\n555,Guaranteed Matrix 
Completion Under Multiple Linear Transformations,,,,,Thursday,Oral 3.2C,15:00,Thursday,Poster 3.2,162,Chao Li,\r\n556,AutoAugment: Learning Augmentation Strategies From Data,,,,,Tuesday,Oral 1.1A,10:04,Tuesday,Poster 1.1,12,Ekin D. Cubuk,\r\n557,DeepView: View Synthesis With Learned Gradient Descent,,,,,Tuesday,Oral 1.2B,13:58,Tuesday,Poster 1.2,123,John Flynn,\r\n558,C2AE: Class Conditioned Auto-Encoder for Open-Set Recognition,,,,,Tuesday,Oral 1.2A,15:10,Tuesday,Poster 1.2,35,Poojan Oza,\r\n559,Reasoning-RCNN: Unifying Adaptive Global Reasoning Into Large-Scale Object Detection,,,,,Wednesday,Oral 2.2A,13:40,Wednesday,Poster 2.2,27,Hang Xu,\r\n560,A Theory of Fermat Paths for Non-Line-Of-Sight Shape Reconstruction,,,,,Wednesday,Oral 2.2C,13:53,Wednesday,Poster 2.2,134,Shumian Xin,\r\n561,Combining 3D Morphable Models: A Large Scale Face-And-Head Model,,,,,Thursday,Oral 3.2B,14:47,Thursday,Poster 3.2,105,Stylianos Ploumpis,\r\n562,Lifting Vectorial Variational Problems: A Natural Formulation Based on Geometric Measure Theory and Discrete Exterior Calculus,,,,,Thursday,Oral 3.2C,14:47,Thursday,Poster 3.2,160,Thomas Mollenhoff,\r\n564,Deep Sky Modeling for Single Image Outdoor Lighting Estimation,,,,,Wednesday,Oral 2.2C,15:10,Wednesday,Poster 2.2,147,Yannick Hold-Geoffroy,\r\n565,Transferrable Prototypical Networks for Unsupervised Domain Adaptation,,,,,Tuesday,Oral 1.2A,14:29,Tuesday,Poster 1.2,28,Yingwei Pan,\r\n566,Learning the Depths of Moving People by Watching Frozen People,,,,,Wednesday,Oral 2.1B,9:29,Wednesday,Poster 2.1,116,Zhengqi Li,\r\n567,Blending-Target Domain Adaptation by Adversarial Meta-Adaptation Networks,,,,,Tuesday,Oral 1.2A,14:34,Tuesday,Poster 1.2,29,Ziliang Chen,\r\n568,Semantic Correlation Promoted Shape-Variant Context for Segmentation,,,,,Thursday,Oral 3.1C,9:11,Thursday,Poster 3.1,62,Henghui Ding,\r\n569,Pushing the Boundaries of View Extrapolation With Multiplane Images,,,,,Tuesday,Oral 1.1B,9:28,Tuesday,Poster 1.1,77,Pratul P. 
Srinivasan,\r\n572,Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning From Radiology Reports and Label Ontology,,,,,Thursday,Oral 3.1A,8:30,Thursday,Poster 3.1,190,Ke Yan,\r\n573,\"Point in, Box Out: Beyond Counting Persons in Crowds\",,,,,Wednesday,Oral 2.2A,14:11,Wednesday,Poster 2.2,32,Yuting Liu,\r\n574,Striking the Right Balance With Uncertainty,,,,,Tuesday,Oral 1.1A,9:59,Tuesday,Poster 1.1,11,Salman Khan,\r\n575,Strand-Accurate Multi-View Hair Capture,,,,,Tuesday,Oral 1.1B,9:18,Tuesday,Poster 1.1,75,Giljoo Nam,\r\n576,In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations,,,,,Thursday,Oral 3.2B,14:29,Thursday,Poster 3.2,102,Ikhsanul Habibie,\r\n577,Blind Visual Motif Removal From a Single Image,,,,,Wednesday,Oral 2.2C,14:29,Wednesday,Poster 2.2,140,Amir Hertz,\r\n579,Boosting Local Shape Matching for Dense 3D Face Correspondence,,,,,Thursday,Oral 3.2B,14:52,Thursday,Poster 3.2,106,Zhenfeng Fan,\r\n580,A Convex Relaxation for Multi-Graph Matching,,,,,Thursday,Oral 3.2C,15:10,Thursday,Poster 3.2,164,Paul Swoboda,\r\n581,Learning Correspondence From the Cycle-Consistency of Time,,,https://ajabri.github.io/timecycle/,,Tuesday,Oral 1.2C,14:11,Tuesday,Poster 1.2,173,Xiaolong Wang,\"Xiaolong Wang, Allan Jabri, Alexei A. Efros\"\r\n582,Which Way Are You Going? 
Imitative Decision Learning for Path Forecasting in Dynamic Scenes,,,,,Tuesday,Oral 1.1C,9:28,Tuesday,Poster 1.1,114,Yuke Li,\r\n583,What Correspondences Reveal About Unknown Camera and Motion Models?,,,,,Thursday,Oral 3.1B,9:06,Thursday,Poster 3.1,83,Thomas Probst,\r\n584,Long-Term Feature Banks for Detailed Video Understanding,,,,,Tuesday,Oral 1.1C,9:23,Tuesday,Poster 1.1,113,Chao-Yuan Wu,\r\n586,GeoNet: Deep Geodesic Networks for Point Cloud Analysis,,https://mp.weixin.qq.com/s/2dqw6__2k51B4SSkLYRTfw?,,,Wednesday,Oral 2.2C,14:47,Wednesday,Poster 2.2,143,Tong He,\"Tong He, Haibin Huang, Li Yi, Yuqian Zhou, Chihao Wu, Jue Wang, Stefano Soatto\"\r\n587,Coordinate-Free Carlsson-Weinshall Duality and Relative Multi-View Geometry,,,,,Tuesday,Oral 1.1B,9:59,Tuesday,Poster 1.1,82,Matthew Trager,\r\n588,Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks,,,,,Wednesday,Oral 2.1A,8:53,Wednesday,Poster 2.1,5,Yinpeng Dong,\r\n589,Robust Subspace Clustering With Independent and Piecewise Identically Distributed Noise Modeling,,,,,Thursday,Oral 3.1B,8:58,Thursday,Poster 3.1,82,Yuanman Li,\r\n590,HoloPose: Holistic 3D Human Reconstruction In-The-Wild,,,,,Thursday,Oral 3.2B,14:16,Thursday,Poster 3.2,100,Riza Alp Guler,\r\n591,CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification,,https://mp.weixin.qq.com/s/68OW_7EIkzCF4HnBVcz6qg,,,Thursday,Oral 3.1B,9:47,Thursday,Poster 3.1,90,Zheng Tang,\"Zheng Tang, Milind Naphade, Ming-Yu Liu, Xiaodong Yang, Stan Birchfield, Shuo Wang, Ratnesh Kumar, David Anastasiu, Jenq-Neng Hwang\"\r\n592,Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration,,,,,Thursday,Oral 3.1A,8:53,Thursday,Poster 3.1,194,De-An Huang,\r\n593,On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions,,,,,Tuesday,Oral 1.1A,9:28,Tuesday,Poster 1.1,6,Yusuke Tsuzuku,\r\n595,Eliminating Exposure Bias and Metric 
Mismatch in Multiple Object Tracking,,,,,Wednesday,Oral 2.1C,9:11,Wednesday,Poster 2.1,142,Andrii Maksai,\r\n596,Balanced Self-Paced Learning for Generative Adversarial Clustering Network,,,,,Wednesday,Oral 2.1A,9:42,Wednesday,Poster 2.1,13,Kamran Ghasedi,\r\n597,LAF-Net: Locally Adaptive Fusion Networks for Stereo Confidence Estimation,,,,,Tuesday,Oral 1.1B,9:46,Tuesday,Poster 1.1,80,Sunok Kim,\r\n598,AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations,,,,,Thursday,Oral 3.2B,13:40,Thursday,Poster 3.2,94,Xiao Zhang,\r\n599,Not All Areas Are Equal: Transfer Learning for Semantic Segmentation via Hierarchical Region Selection,,,,,Wednesday,Oral 2.1A,9:24,Wednesday,Poster 2.1,10,Ruoqi Sun,\r\n600,STEP: Spatio-Temporal Progressive Learning for Video Action Detection,,,,,Tuesday,Oral 1.1C,9:10,Tuesday,Poster 1.1,111,Xitong Yang,\r\n601,Tangent-Normal Adversarial Regularization for Semi-Supervised Learning,,,,,Thursday,Oral 3.2A,13:58,Thursday,Poster 3.2,6,Bing Yu,\r\n602,Underexposed Photo Enhancement Using Deep Illumination Estimation,,,,,Wednesday,Oral 2.2C,14:24,Wednesday,Poster 2.2,139,Ruixing Wang,\r\n603,Deep Tree Learning for Zero-Shot Face Anti-Spoofing,,,,,Wednesday,Oral 2.1C,9:34,Wednesday,Poster 2.1,146,Yaojie Liu,\r\n604,Multimodal Explanations by Predicting Counterfactuality in Videos,,,,,Thursday,Oral 3.1A,9:11,Thursday,Poster 3.1,197,Atsushi Kanehira,\r\n605,The Visual Centrifuge: Model-Free Layered Video Representations,,,,,Tuesday,Oral 1.2B,14:52,Tuesday,Poster 1.2,132,Jean-Baptiste Alayrac,\r\n606,AdaGraph: Unifying Predictive and Continuous Domain Adaptation Through Graphs,,,,,Wednesday,Oral 2.2A,15:10,Wednesday,Poster 2.2,42,Massimiliano Mancini,\r\n607,MeshAdv: Adversarial Meshes for Visual Recognition,,,,,Wednesday,Oral 2.2C,14:52,Wednesday,Poster 2.2,144,Chaowei Xiao,\r\n608,Incremental Object Learning From Contiguous Views,,,,,Thursday,Oral 3.1B,9:34,Thursday,Poster 3.1,88,Stefan Stojanov,\r\n609,Mitigating 
Information Leakage in Image Representations: A Maximum Entropy Approach,,,,,Tuesday,Oral 1.2C,14:24,Tuesday,Poster 1.2,175,Proteek Chandan Roy,\r\n610,Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss,,,,,Wednesday,Oral 2.1A,9:16,Wednesday,Poster 2.1,9,Sangil Jung,\r\n611,Estimating 3D Motion and Forces of Person-Object Interactions From Monocular Video,,,,,Thursday,Oral 3.1A,9:42,Thursday,Poster 3.1,202,Zongmian Li,\r\n612,Robust Histopathology Image Analysis: To Label or to Synthesize?,,,,,Thursday,Oral 3.1A,8:35,Thursday,Poster 3.1,191,Le Hou,\r\n613,BAD SLAM: Bundle Adjusted Direct RGB-D SLAM,,,,,Tuesday,Oral 1.1B,9:05,Tuesday,Poster 1.1,73,Thomas Schops,\r\n614,Learning Personalized Modular Network Guided by Structured Knowledge,,,,,Thursday,Oral 3.1C,9:47,Thursday,Poster 3.1,68,Xiaodan Liang,\r\n615,Embodied Question Answering in Photorealistic Environments With Point Cloud Perception,,,,,Wednesday,Oral 2.2B,14:16,Wednesday,Poster 2.2,185,Erik Wijmans,\r\n616,What and How Well You Performed? 
A Multitask Learning Approach to Action Quality Assessment,,,,,Tuesday,Oral 1.1C,9:36,Tuesday,Poster 1.1,115,Paritosh Parmar,\r\n618,ContextDesc: Local Descriptor Augmentation With Cross-Modality Context,,,,,Tuesday,Oral 1.2C,13:48,Tuesday,Poster 1.2,169,Zixin Luo,\r\n619,Adaptive NMS: Refining Pedestrian Detection in a Crowd,,,,,Wednesday,Oral 2.2A,14:06,Wednesday,Poster 2.2,31,Songtao Liu,\r\n620,Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth,,,,,Thursday,Oral 3.1C,8:40,Thursday,Poster 3.1,57,Davy Neven,\r\n621,Inverse Path Tracing for Joint Material and Lighting Estimation,,,,,Tuesday,Oral 1.2B,14:47,Tuesday,Poster 1.2,131,Dejan Azinovic,\r\n622,Adversarial Attacks Beyond the Image Space,,,,,Wednesday,Oral 2.1A,8:48,Wednesday,Poster 2.1,4,Xiaohui Zeng,\r\n623,Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model,,,,,Tuesday,Oral 1.1C,9:54,Tuesday,Poster 1.1,118,Weining Wang,\r\n624,High-Quality Face Capture Using Anatomical Muscles,,,,,Thursday,Oral 3.2B,13:30,Thursday,Poster 3.2,92,Michael Bao,\r\n625,APDrawingGAN: Generating Artistic Portrait Drawings From Face Photos With Hierarchical GANs,,,,,Thursday,Oral 3.2A,14:42,Thursday,Poster 3.2,13,Ran Yi,\r\n626,GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering,,,,,Wednesday,Oral 2.2B,14:42,Wednesday,Poster 2.2,189,Drew A. 
Hudson,\r\n627,Panoptic Feature Pyramid Networks,,,,,Wednesday,Oral 2.2A,13:30,Wednesday,Poster 2.2,25,Alexander Kirillov,\r\n628,Image Deformation Meta-Networks for One-Shot Learning,,,,,Thursday,Oral 3.1B,8:35,Thursday,Poster 3.1,78,Zitian Chen,\r\n629,Learning Loss for Active Learning,,,,,Tuesday,Oral 1.1A,9:54,Tuesday,Poster 1.1,10,Donggeun Yoo,\r\n630,Graph Convolutional Tracking,,,,,Wednesday,Oral 2.1C,9:16,Wednesday,Poster 2.1,143,Junyu Gao,\r\n631,Tell Me Where I Am: Object-Level Scene Context Prediction,,,,,Tuesday,Oral 1.2C,14:52,Tuesday,Poster 1.2,180,Xiaotian Qiao,\r\n632,Max-Sliced Wasserstein Distance and Its Use for GANs,,,,,Thursday,Oral 3.2A,13:40,Thursday,Poster 3.2,3,Ishan Deshpande,\r\n633,Composing Text and Image for Image Retrieval - an Empirical Odyssey,,,,,Wednesday,Oral 2.2A,13:53,Wednesday,Poster 2.2,29,Nam Vo,\r\n634,Residual Networks for Light Field Image Super-Resolution,,,,,Thursday,Oral 3.2C,14:06,Thursday,Poster 3.2,153,Shuo Zhang,\r\n635,Explainability Methods for Graph Convolutional Neural Networks,,,,,Thursday,Oral 3.2A,15:00,Thursday,Poster 3.2,16,Phillip E. Pope,\r\n637,SoDeep: A Sorting Deep Net to Learn Ranking Loss Surrogates,,,,,Thursday,Oral 3.2A,15:10,Thursday,Poster 3.2,18,Martin Engilberge,\r\n638,\"\"\"Double-DIP\"\": Unsupervised Image Decomposition via Coupled Deep-Image-Priors\",,,,,Thursday,Oral 3.2C,13:53,Thursday,Poster 3.2,151,Yosef Gandelsman,\r\n639,Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval,,,,,Tuesday,Oral 1.2A,13:53,Tuesday,Poster 1.2,22,Sounak Dey,\r\n640,Cascaded Projection: End-To-End Network Compression and Acceleration,,,,,Thursday,Oral 3.2A,14:24,Thursday,Poster 3.2,10,Breton Minnehan,\r\n641,High Flux Passive Imaging With Single-Photon Sensors,,,,,Wednesday,Oral 2.2C,13:35,Wednesday,Poster 2.2,131,Atul Ingle,\r\n642,A Generative Adversarial Density Estimator,,,,,Thursday,Oral 3.2A,15:05,Thursday,Poster 3.2,17,M. 
Ehsan Abbasnejad,\r\n643,Sampling Techniques for Large-Scale Object Detection From Sparsely Annotated Objects,,,,,Wednesday,Oral 2.2A,14:34,Wednesday,Poster 2.2,36,Yusuke Niitani,\r\n644,Non-Local Meets Global: An Integrated Paradigm for Hyperspectral Denoising,,,,,Wednesday,Oral 2.2C,14:34,Wednesday,Poster 2.2,141,Wei He,\r\n645,Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering,,,,,Wednesday,Oral 2.2B,14:06,Wednesday,Poster 2.2,183,Peng Gao,\r\n646,Unsupervised Part-Based Disentangling of Object Shape and Appearance,,,,,Thursday,Oral 3.2B,15:00,Thursday,Poster 3.2,107,Dominik Lorenz,\r\n647,Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation,,,,,Wednesday,Oral 2.2A,13:58,Wednesday,Poster 2.2,30,Xiaobing Wang,\r\n648,CollaGAN: Collaborative GAN for Missing Image Data Imputation,,,,,Tuesday,Oral 1.2B,15:10,Tuesday,Poster 1.2,135,Dongwook Lee,\r\n649,Semantics Disentangling for Text-To-Image Generation,,,,,Tuesday,Oral 1.2B,13:35,Tuesday,Poster 1.2,119,Guojun Yin,\r\n650,Two Body Problem: Collaborative Visual Task Completion,,https://prior.allenai.org/projects/two-body-problem,,,Wednesday,Oral 2.2B,14:34,Wednesday,Poster 2.2,188,Unnat Jain,\"Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi\"\r\n651,3D Shape Reconstruction From Images in the Frequency Domain,,,,,Wednesday,Oral 2.1B,8:58,Wednesday,Poster 2.1,111,Weichao Shen,\r\n652,Fast Spatially-Varying Indoor Lighting Estimation,,,,,Wednesday,Oral 2.2C,15:00,Wednesday,Poster 2.2,145,Mathieu Garon,\r\n653,Steady-State Non-Line-Of-Sight Imaging,,,,,Wednesday,Oral 2.2C,13:48,Wednesday,Poster 2.2,133,Wenzheng Chen,\r\n654,SFNet: Learning Object-Aware Semantic Correspondence,,,,,Tuesday,Oral 1.2A,14:52,Tuesday,Poster 1.2,32,Junghyup Lee,\r\n655,Geometry-Consistent Generative Adversarial Networks for One-Sided Unsupervised Domain Mapping,,,,,Tuesday,Oral 1.2B,14:34,Tuesday,Poster 
1.2,129,Huan Fu,\r\n656,Gradient Matching Generative Networks for Zero-Shot Learning,,,,,Tuesday,Oral 1.2A,13:48,Tuesday,Poster 1.2,21,Mert Bulent Sariyildiz,\r\n657,Finding Task-Relevant Features for Few-Shot Learning by Category Traversal,,,,,Tuesday,Oral 1.1A,9:00,Tuesday,Poster 1.1,1,Hongyang Li,\r\n658,Look Back and Predict Forward in Image Captioning,,,,,,,,Wednesday,Poster 2.2,201,Yu Qin,\r\n659,Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors,,,,,,,,Thursday,Poster 3.1,26,Mohammad Mahdi Derakhshani,\r\n660,Group Sampling for Scale Invariant Face Detection,,,,,,,,Tuesday,Poster 1.2,97,Xiang Ming,\r\n661,Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN,,,,,,,,Tuesday,Poster 1.1,89,Shiyi Lan,\r\n662,Domain-Specific Batch Normalization for Unsupervised Domain Adaptation,,,,,,,,Wednesday,Poster 2.2,62,Woong-Gi Chang,\r\n663,Radial Distortion Triangulation,,,,,,,,Thursday,Poster 3.1,105,Zuzana Kukelova,\r\n664,Polarimetric Camera Calibration Using an LCD Monitor,,,,,,,,Tuesday,Poster 1.2,145,Zhixiang Wang,\r\n665,Towards High-Fidelity Nonlinear 3D Face Morphable Model,,,http://cvlab.cse.msu.edu/project-nonlinear-3dmm.html?,,,,,Tuesday,Poster 1.1,102,Luan Tran,\"Luan Tran, Feng Liu, Xiaoming Liu\"\r\n666,Towards Instance-Level Image-To-Image Translation,,,,,,,,Tuesday,Poster 1.2,139,Zhiqiang Shen,\r\n667,Structure-Preserving Stereoscopic View Synthesis With Multi-Scale Adversarial Correlation Matching,,,,,,,,Wednesday,Poster 2.1,162,Yu Zhang,\r\n668,Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks,,,,,,,,Wednesday,Poster 2.2,59,N. 
Dinesh Reddy,\r\n669,Co-Occurrence Neural Network,,,,,,,,Wednesday,Poster 2.1,24,Irina Shevlev,\r\n670,Learning to Film From Professional Human Motion Videos,,,,,,,,Tuesday,Poster 1.2,214,Chong Huang,\r\n671,Stochastic Class-Based Hard Example Mining for Deep Metric Learning,,,,,,,,Wednesday,Poster 2.2,51,Yumin Suh,\r\n672,Modularized Textual Grounding for Counterfactual Resilience,,,,,,,,Wednesday,Poster 2.1,215,Zhiyuan Fang,\r\n673,Improving Few-Shot User-Specific Gaze Adaptation via Gaze Redirection Synthesis,,,,,,,,Thursday,Poster 3.2,116,Yu Yu,\r\n674,Global Second-Order Pooling Convolutional Networks,,,,,,,,Tuesday,Poster 1.2,55,Zilin Gao,\r\n675,Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation,,,,,,,,Tuesday,Poster 1.1,18,Bohan Zhuang,\r\n676,Revisiting Perspective Information for Efficient Crowd Counting,,,,,,,,Wednesday,Poster 2.2,54,Miaojing Shi,\r\n677,Spectral Reconstruction From Dispersive Blur: A Novel Light Efficient Spectral Imager,,,,,,,,Thursday,Poster 3.2,143,Yuanyuan Zhao,\r\n678,Joint Face Detection and Facial Motion Retargeting for Multiple Faces,,,,,,,,Thursday,Poster 3.1,109,Bindita Chaudhuri,\r\n680,Perturbation Analysis of the 8-Point Algorithm: A Case Study for Wide FoV Cameras,,,,,,,,Thursday,Poster 3.2,80,Thiago L. T. 
da Silveira,\r\n681,R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network,,,,,,,,Thursday,Poster 3.2,51,Bin Zhu,\r\n682,Predicting Visible Image Differences Under Varying Display Brightness and Viewing Distance,,,,,,,,Wednesday,Poster 2.1,89,Nanyang Ye,\r\n683,Local Detection of Stereo Occlusion Boundaries,,,,,,,,Tuesday,Poster 1.2,153,Jialiang Wang,\r\n684,Attribute-Driven Feature Disentangling and Temporal Aggregation for Video Person Re-Identification,,,,,,,,Wednesday,Poster 2.1,36,Yiru Zhao,\r\n685,NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection,,,,,,,,Wednesday,Poster 2.2,11,Golnaz Ghiasi,\r\n686,Local Relationship Learning With Person-Specific Shape Regularization for Facial Action Unit Detection,,,,,,,,Thursday,Poster 3.2,114,Xuesong Niu,\r\n687,SimulCap : Single-View Human Performance Capture With Cloth Simulation,,,,,,,,Wednesday,Poster 2.1,96,Tao Yu,\r\n688,Texture Mixer: A Network for Controllable Synthesis and Interpolation of Texture,,,,,,,,Thursday,Poster 3.2,139,Ning Yu,\r\n689,Convolutional Relational Machine for Group Activity Recognition,,,,,,,,Wednesday,Poster 2.2,117,Sina Mokhtarzadeh Azar,\r\n690,Privacy Protection in Street-View Panoramas Using Depth and Multi-View Imagery,,,,,,,,Thursday,Poster 3.1,212,Ries Uittenbogaard,\r\n691,PMS-Net: Robust Haze Removal Based on Patch Map for Single Images,,,,,,,,Thursday,Poster 3.2,72,Wei-Ting Chen,\r\n692,CrDoCo: Pixel-Level Domain Transfer With Cross-Domain Consistency,,,,,,,,Tuesday,Poster 1.1,181,Yun-Chun Chen,\r\n693,Focus Is All You Need: Loss Functions for Event-Based Vision,,,,,,,,Thursday,Poster 3.2,169,Guillermo Gallego,\r\n694,TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions,,,,,,,,Wednesday,Poster 2.2,213,Rohan Chandra,\r\n695,Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks,,,,,,,,Tuesday,Poster 1.1,198,Peng Wang,\r\n697,Grounding Human-To-Vehicle 
Advice for Self-Driving Vehicles,,,,,,,,Thursday,Poster 3.1,213,Jinkyu Kim,\r\n698,Divergence Prior and Vessel-Tree Reconstruction,,,,,,,,Thursday,Poster 3.1,160,Zhongwen Zhang,\r\n699,Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions,,,,,,,,Wednesday,Poster 2.2,210,Joey Hong,\r\n700,TraVeLGAN: Image-To-Image Translation by Transformation Vector Learning,,,,,,,,Thursday,Poster 3.1,3,Matthew Amodio,\r\n701,IRLAS: Inverse Reinforcement Learning for Architecture Search,,,,,,,,Thursday,Poster 3.1,7,Minghao Guo,\r\n702,Scene Categorization From Contours: Medial Axis Based Salience Measures,,,,,,,,Tuesday,Poster 1.2,201,Morteza Rezanejad,\r\n703,Single Image Depth Estimation Trained via Depth From Defocus Cues,,,,,,,,Wednesday,Poster 2.2,96,Shir Gur,\r\n704,MBS: Macroblock Scaling for CNN Model Reduction,,,,,,,,Thursday,Poster 3.1,17,Yu-Hsun Lin,\r\n705,Deep Asymmetric Metric Learning via Rich Relationship Mining,,,,,,,,Tuesday,Poster 1.2,197,Xinyi Xu,\r\n707,Engaging Image Captioning via Personality,,,,,,,,Thursday,Poster 3.2,193,Kurt Shuster,\r\n708,Quasi-Unsupervised Color Constancy,,,,,,,,Thursday,Poster 3.2,144,Simone Bianco,\r\n709,Typography With Decor: Intelligent Text Style Transfer,,,,,,,,Wednesday,Poster 2.1,165,Wenjing Wang,\r\n710,Convolutional Neural Networks Can Be Deceived by Visual Illusions,,,,,,,,Thursday,Poster 3.2,172,Alexander Gomez-Villa,\r\n711,Label Propagation for Deep Semi-Supervised Learning,,,,,,,,Wednesday,Poster 2.1,52,Ahmet Iscen,\r\n712,Recurrent Neural Networks With Intra-Frame Iterations for Video Deblurring,,,,,,,,Wednesday,Poster 2.2,156,Seungjun Nah,\r\n713,Deep Flow-Guided Video Inpainting,,,,,,,,Tuesday,Poster 1.2,143,Rui Xu,\r\n714,End-To-End Time-Lapse Video Synthesis From a Single Outdoor Image,,,,,,,,Tuesday,Poster 1.1,142,Seonghyeon Nam,\r\n715,Face Parsing With RoI Tanh-Warping,,,,,,,,Wednesday,Poster 2.1,126,Jinpeng Lin,\r\n716,Dichromatic Model Based Temporal Color Constancy 
for AC Light Sources,,,,,,,,Thursday,Poster 3.2,174,Jun-Sang Yoo,\r\n717,Snapshot Distillation: Teacher-Student Optimization in One Generation,,,,,,,,Tuesday,Poster 1.2,38,Chenglin Yang,\r\n718,Large Scale Incremental Learning,,,,,,,,Tuesday,Poster 1.1,14,Yue Wu,\r\n719,Variational Autoencoders Pursue PCA Directions (by Accident),,,,,,,,Thursday,Poster 3.2,182,Michal Rolinek,\r\n720,SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360deg Images,,,,,,,,Thursday,Poster 3.1,24,Yeonkun Lee,\r\n721,Learning With Batch-Wise Optimal Transport Loss for 3D Shape Recognition,,,,,,,,Tuesday,Poster 1.2,86,Lin Xu,\r\n722,EV-Gait: Event-Based Robust Gait Recognition Using Dynamic Vision Sensors,,,,,,,,Wednesday,Poster 2.1,213,Yanxiang Wang,\r\n723,Reducing Uncertainty in Undersampled MRI Reconstruction With Active Acquisition,,,,,,,,Tuesday,Poster 1.1,207,Zizhao Zhang,\r\n724,Cross-Modal Self-Attention Network for Referring Image Segmentation,,,,,,,,Thursday,Poster 3.1,189,Linwei Ye,\r\n725,Residual Regression With Semantic Prior for Crowd Counting,,,,,,,,Tuesday,Poster 1.2,193,Jia Wan,\r\n726,LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving,,,,,,,,Thursday,Poster 3.2,209,Gregory P. Meyer,\r\n727,VITAMIN-E: VIsual Tracking and MappINg With Extremely Dense Feature Points,,,,,,,,Thursday,Poster 3.1,101,Masashi Yokozuka,\r\n728,Atlas of Digital Pathology: A Generalized Hierarchical Histological Tissue Type-Annotated Database for Deep Learning,,,,,,,,Thursday,Poster 3.2,79,Mahdi S. 
Hosseini,\r\n729,ROI Pooled Correlation Filters for Visual Tracking,,,,,,,,Wednesday,Poster 2.1,154,Yuxuan Sun,\r\n730,Circulant Binary Convolutional Networks: Enhancing the Performance of 1-Bit DCNNs With Circulant Back Propagation,,,,,,,,Tuesday,Poster 1.2,3,Chunlei Liu,\r\n731,Attention-Guided Unified Network for Panoptic Segmentation,,https://mp.weixin.qq.com/s/1tohID6SM3weS476XU5okw?,,,,,,Wednesday,Poster 2.2,10,Yanwei Li,\"Yanwei Li, Xinze Chen, Zheng Zhu, Lingxi Xie, Guan Huang, Dalong Du, Xingang Wang\"\r\n732,Intention Oriented Image Captions With Guiding Objects,,,,,,,,Wednesday,Poster 2.2,204,Yue Zheng,\r\n733,Weakly Supervised Video Moment Retrieval From Text Queries,,,,,,,,Thursday,Poster 3.2,63,Niluthpol Chowdhury Mithun,\r\n734,Effective Aesthetics Prediction With Multi-Level Spatially Pooled Features,,,,,,,,Thursday,Poster 3.1,44,Vlad Hosu,\r\n735,AANet: Attribute Attention Network for Person Re-Identifications,,,,,,,,Wednesday,Poster 2.2,21,Chiat-Pin Tay,\r\n736,Feature-Level Frankenstein: Eliminating Variations for Discriminative Recognition,,,,,,,,Tuesday,Poster 1.1,41,Xiaofeng Liu,\r\n737,Interaction-And-Aggregation Network for Person Re-Identification,,,,,,,,Thursday,Poster 3.1,38,Ruibing Hou,\r\n738,Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach,,,,,,,,Tuesday,Poster 1.2,90,Amir Atapour-Abarghouei,\r\n739,Photo Wake-Up: 3D Character Animation From a Single Photo,,,,,,,,Wednesday,Poster 2.1,167,Chung-Yi Weng,\r\n740,Adversarial Defense by Stratified Convolutional Sparse Coding,,,,,,,,Thursday,Poster 3.2,48,Bo Sun,\r\n741,Customizable Architecture Search for Semantic Segmentation,,,,,,,,Thursday,Poster 3.2,68,Yiheng Zhang,\r\n742,Neuro-Inspired Eye Tracking With Eye Movement Dynamics,,,,,,,,Thursday,Poster 3.1,120,Kang Wang,\r\n743,On Learning Density Aware Embeddings,,,,,,,,Wednesday,Poster 2.1,33,Soumyadeep Ghosh,\r\n744,Accelerating Convolutional 
Neural Networks via Activation Map Compression,,,,,,,,Wednesday,Poster 2.2,16,Georgios Georgiadis,\r\n745,Characterizing and Avoiding Negative Transfer,,,,,,,,Thursday,Poster 3.2,32,Zirui Wang,\r\n746,Weakly Supervised Image Classification Through Noise Regularization,,,,,,,,Thursday,Poster 3.2,55,Mengying Hu,\r\n748,You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection,,,,,,,,Thursday,Poster 3.1,48,Krishna Kumar Singh,\r\n749,On Implicit Filter Level Sparsity in Convolutional Neural Networks,,,,,,,,Tuesday,Poster 1.1,29,Dushyant Mehta,\r\n750,T-Net: Parametrizing Fully Convolutional Nets With a Single High-Order Tensor,,,,,,,,Wednesday,Poster 2.2,110,Jean Kossaifi,\r\n751,Learning Binary Code for Personalized Fashion Recommendation,,,,,,,,Thursday,Poster 3.1,210,Zhi Lu,\r\n752,Hyperspectral Image Super-Resolution With Optimized RGB Guidance,,,,,,,,Thursday,Poster 3.2,70,Ying Fu,\r\n753,Learning Active Contour Models for Medical Image Segmentation,,,,,,,,Thursday,Poster 3.2,67,Xu Chen,\r\n754,Graph Attention Convolution for Point Cloud Semantic Segmentation,,,,,,,,Thursday,Poster 3.1,168,Lei Wang,\r\n755,Coordinate-Based Texture Inpainting for Pose-Guided Human Image Generation,,,,,,,,Thursday,Poster 3.2,136,Artur Grigorev,\r\n756,Detection Based Defense Against Adversarial Examples From the Steganalysis Point of View,,,,,,,,Wednesday,Poster 2.1,27,Jiayang Liu,\r\n757,A Parametric Top-View Representation of Complex Road Scenes,,,,,,,,Thursday,Poster 3.1,171,Ziyan Wang,\r\n758,Unsupervised Learning of Consensus Maximization for 3D Vision Problems,,,,,,,,Tuesday,Poster 1.1,70,Thomas Probst,\r\n759,LVIS: A Dataset for Large Vocabulary Instance Segmentation,,,,,,,,Wednesday,Poster 2.1,81,Agrim Gupta,\r\n761,Audio Visual Scene-Aware Dialog,,,,,,,,Wednesday,Poster 2.2,83,Huda Alamri,\r\n762,Graph-Based Global Reasoning Networks,,,,,,,,Tuesday,Poster 1.1,20,Yunpeng Chen,\r\n763,Adversarial Structure Matching 
for Structured Prediction Tasks,,,,,,,,Tuesday,Poster 1.2,195,Jyh-Jing Hwang,\r\n764,\"Face Anti-Spoofing: Model Matters, so Does Data\",,,,,,,,Tuesday,Poster 1.2,103,Xiao Yang,\r\n765,Unsupervised Domain-Specific Deblurring via Disentangled Representations,,,,,,,,Thursday,Poster 3.1,161,Boyu Lu,\r\n766,Propagation Mechanism for Deep and Wide Neural Networks,,,,,,,,Thursday,Poster 3.1,28,Dejiang Xu,\r\n767,MirrorGAN: Learning Text-To-Image Generation by Redescription,,,,,,,,Tuesday,Poster 1.1,152,Tingting Qiao,\"Tingting Qiao, Jing Zhang, Duanqing Xu, Dacheng Tao\"\r\n768,What's to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions,,,,,,,,Tuesday,Poster 1.2,205,Ehsan Abbasnejad,\r\n769,Pushing the Envelope for RGB-Based Dense 3D Hand Pose Estimation via Neural Rendering,,,,,,,,Tuesday,Poster 1.1,96,Seungryul Baek,\r\n770,Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing,,,,,,,,Tuesday,Poster 1.1,197,Xihui Liu,\r\n771,ShieldNets: Defending Against Adversarial Attacks Using Probabilistic Adversarial Robustness,,,,,,,,Wednesday,Poster 2.2,6,Rajkumar Theagarajan,\r\n772,Semi-Supervised Transfer Learning for Image Rain Removal,,,,,,,,Tuesday,Poster 1.2,159,Wei Wei,\r\n773,Assessing Personally Perceived Image Quality via Image Features and Collaborative Filtering,,,,,,,,Wednesday,Poster 2.2,163,Jari Korhonen,\r\n774,Layout-Graph Reasoning for Fashion Landmark Detection,,,,,,,,Tuesday,Poster 1.2,46,Weijiang Yu,\r\n775,RES-PCA: A Scalable Approach to Recovering Low-Rank Matrices,,,,,,,,Wednesday,Poster 2.2,58,Chong Peng,\r\n776,Using Unknown Occluders to Recover Hidden Scenes,,,,,,,,Thursday,Poster 3.2,146,Adam B. 
Yedidia,\r\n778,Turn a Silicon Camera Into an InGaAs Camera,,,,,,,,Wednesday,Poster 2.1,175,Feifan Lv,\r\n779,Object Detection With Location-Aware Deformable Convolution and Backward Attention Filtering,,,,,,,,Thursday,Poster 3.1,52,Chen Zhang,\r\n780,D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation,,,,,,,,Tuesday,Poster 1.2,107,Chien-Yi Chang,\r\n781,Learning Channel-Wise Interactions for Binary Convolutional Neural Networks,,,,,,,,Tuesday,Poster 1.1,34,Ziwei Wang,\r\n782,Deep Multimodal Clustering for Unsupervised Audiovisual Learning,,,,,,,,Thursday,Poster 3.1,31,Di Hu,\r\n783,Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting,,,,,,,,Thursday,Poster 3.2,215,Muming Zhao,\r\n784,AdaFrame: Adaptive Frame Selection for Fast Video Recognition,,,,,,,,Tuesday,Poster 1.1,129,Zuxuan Wu,\r\n785,World From Blur,,,,,,,,Wednesday,Poster 2.2,214,Jiayan Qiu,\r\n786,EventNet: Asynchronous Recursive Event Processing,,,,,,,,Tuesday,Poster 1.2,160,Yusuke Sekikawa,\r\n787,Towards Accurate One-Stage Object Detection With AP-Loss,,,,,,,,Wednesday,Poster 2.1,57,Kean Chen,\r\n788,Completeness Modeling and Context Separation for Weakly Supervised Temporal Action Localization,,,,,,,,Tuesday,Poster 1.1,131,Daochang Liu,\r\n789,Progressive Feature Alignment for Unsupervised Domain Adaptation,,,,,,,,Tuesday,Poster 1.1,40,Chaoqi Chen,\r\n790,Domain-Symmetric Networks for Adversarial Domain Adaptation,,,,,,,,Wednesday,Poster 2.1,48,Yabin Zhang,\r\n791,Multi-Granularity Generator for Temporal Action Proposal,,,,,,,,Tuesday,Poster 1.2,113,Yuan Liu,\r\n792,Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks,,,,,,,,Thursday,Poster 3.1,15,Jorg Wagner,\r\n793,Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up,,,,,,,,Tuesday,Poster 1.2,56,Weifeng Ge,\r\n794,Hierarchical Disentanglement of Discriminative Latent Features for Zero-Shot 
Learning,,,,,,,,Thursday,Poster 3.2,50,Bin Tong,\r\n795,Classification-Reconstruction Learning for Open-Set Recognition,,,,,,,,Tuesday,Poster 1.2,191,Ryota Yoshihashi,\r\n796,A Variational Pan-Sharpening With Local Gradient Constraints,,,,,,,,Thursday,Poster 3.1,165,Xueyang Fu,\r\n797,Non-Adversarial Image Synthesis With Generative Latent Nearest Neighbors,,,,,,,,Wednesday,Poster 2.1,157,Yedid Hoshen,\r\n798,An End-To-End Network for Panoptic Segmentation,,,,,,,,Wednesday,Poster 2.1,194,Huanyu Liu,\r\n799,ToothNet: Automatic Tooth Instance Segmentation and Identification From Cone Beam CT Images,,,,,,,,Wednesday,Poster 2.1,214,Zhiming Cui,\r\n800,Unsupervised Disentangling of Appearance and Geometry by Deformable Generator Network,,,,,,,,Thursday,Poster 3.1,174,Xianglei Xing,\r\n801,A Variational Auto-Encoder Model for Stochastic Point Processes,,,,,,,,Tuesday,Poster 1.2,69,Nazanin Mehrasa,\r\n802,Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation,,,,,,,,Tuesday,Poster 1.1,47,Yunhang Shen,\r\n803,RepMet: Representative-Based Metric Learning for Classification and Few-Shot Object Detection,,,,,,,,Wednesday,Poster 2.1,65,Leonid Karlinsky,\r\n804,Deep ChArUco: Dark ChArUco Marker Pose Estimation,,,,,,,,Wednesday,Poster 2.2,208,Danying Hu,\r\n805,Learning-Based Sampling for Natural Image Matting,,,,,,,,Tuesday,Poster 1.2,58,Jingwei Tang,\r\n806,Actively Seeking and Learning From Live Data,,,,,,,,Tuesday,Poster 1.1,196,Damien Teney,\r\n807,Depth Coefficients for Depth Completion,,,,,,,,Thursday,Poster 3.2,186,Saif Imran,\r\n808,PDE Acceleration for Active Contours,,,,,,,,Thursday,Poster 3.2,173,Anthony Yezzi,\r\n809,On the Intrinsic Dimensionality of Image Representations,,,,,,,,Tuesday,Poster 1.2,188,Sixue Gong,\r\n810,PA3D: Pose-Action 3D Machine for Video Recognition,,,,,,,,Wednesday,Poster 2.2,120,An Yan,\r\n811,UniformFace: Learning Deep Equidistributed Representation for Face Recognition,,,,,,,,Tuesday,Poster 1.2,94,Yueqi Duan,\r\n812,Direct 
Object Recognition Without Line-Of-Sight Using Optical Coherence,,,,,,,,Thursday,Poster 3.2,78,Xin Lei,\r\n813,Deep Dual Relation Modeling for Egocentric Interaction Recognition,,,,,,,,Wednesday,Poster 2.2,121,Haoxin Li,\r\n814,Learning Independent Object Motion From Unlabelled Stereoscopic Videos,,,,,,,,Wednesday,Poster 2.1,105,Zhe Cao,\r\n815,Spectral Metric for Dataset Complexity Assessment,,,,,,,,Tuesday,Poster 1.2,74,Frederic Branchaud-Charron,\r\n816,Progressive Ensemble Networks for Zero-Shot Recognition,,,,,,,,Thursday,Poster 3.2,77,Meng Ye,\r\n818,Grid R-CNN,,,,,,,,Wednesday,Poster 2.2,63,Xin Lu,\r\n819,Spatial-Aware Graph Relation Network for Large-Scale Object Detection,,,,,,,,Thursday,Poster 3.1,36,Hang Xu,\r\n821,Cross-Classification Clustering: An Efficient Multi-Object Tracking Technique for 3-D Instance Segmentation in Connectomics,,,,,,,,Wednesday,Poster 2.2,207,Yaron Meirovitch,\r\n822,ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model,,,,,,,,Thursday,Poster 3.2,23,Haichuan Yang,\r\n823,Led3D: A Lightweight and Efficient Deep Approach to Recognizing Low-Quality 3D Faces,,,,,,,,Wednesday,Poster 2.1,153,Guodong Mu,\r\n824,MSCap: Multi-Style Image Captioning With Unpaired Stylized Text,,,,,,,,Tuesday,Poster 1.2,210,Longteng Guo,\r\n826,DDLSTM: Dual-Domain LSTM for Cross-Dataset Action Recognition,,,,,,,,Wednesday,Poster 2.2,113,Toby Perrett,\r\n827,Weakly Supervised Open-Set Domain Adaptation by Dual-Domain Collaboration,,,,,,,,Wednesday,Poster 2.1,85,Shuhan Tan,\r\n828,MARS: Motion-Augmented RGB Stream for Action Recognition,,,,,,,,Wednesday,Poster 2.2,116,Nieves Crasto,\r\n829,Less Is More: Learning Highlight Detection From Video Duration,,,,,,,,Tuesday,Poster 1.1,127,Bo Xiong,\r\n830,3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis,,,,,,,,Wednesday,Poster 2.2,95,Xiaojuan Qi,\r\n831,Distraction-Aware Shadow Detection,,,,,,,,Wednesday,Poster 2.1,62,Quanlong 
Zheng,\r\n832,Unifying Heterogeneous Classifiers With Distillation,,,,,,,,Tuesday,Poster 1.2,70,Jayakorn Vongkulbhisal,\r\n833,Self-Supervised Representation Learning by Rotation Feature Decoupling,,,,,,,,Thursday,Poster 3.1,175,Zeyu Feng,\r\n834,Compressing Convolutional Neural Networks via Factorized Convolutional Filters,,,,,,,,Tuesday,Poster 1.2,187,Tuanhui Li,\r\n835,Discovering Fair Representations in the Data Domain,,,,,,,,Wednesday,Poster 2.2,169,Novi Quadrianto,\r\n836,\"Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation\",,,,,,,,Tuesday,Poster 1.2,200,Zhenyu Zhang,\r\n837,Speech2Face: Learning the Face Behind a Voice,,,,,,,,Wednesday,Poster 2.2,81,Tae-Hyun Oh,\r\n838,Informative Object Annotations: Tell Me Something I Don't Know,,,,,,,,Thursday,Poster 3.2,192,Lior Bracha,\r\n839,DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis,,,,,,,,Wednesday,Poster 2.1,156,Minfeng Zhu,\r\n840,Gotta Adapt 'Em All: Joint Pixel and Feature-Level Domain Adaptation for Recognition in the Wild,,,,,,,,Tuesday,Poster 1.2,1,Luan Tran,\r\n841,3D Local Features for Direct Pairwise Registration,,,,,,,,Tuesday,Poster 1.2,77,Haowen Deng,\r\n842,Unsupervised Primitive Discovery for Improved 3D Generative Modeling,,,,,,,,Thursday,Poster 3.1,111,Salman H. 
Khan,\r\n843,Learning to Minify Photometric Stereo,,,,,,,,Wednesday,Poster 2.2,84,Junxuan Li,\r\n844,Learning to Calibrate Straight Lines for Fisheye Image Rectification,,,,,,,,Tuesday,Poster 1.1,166,Zhucun Xue,\r\n845,Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net,,,,,,,,Tuesday,Poster 1.1,160,Qi Xie,\r\n847,Selective Sensor Fusion for Neural Visual-Inertial Odometry,,,,,,,,Thursday,Poster 3.1,208,Changhao Chen,\r\n848,SpotTune: Transfer Learning Through Adaptive Fine-Tuning,,,,,,,,Wednesday,Poster 2.1,25,Yunhui Guo,\r\n849,Learning a Deep ConvNet for Multi-Label Classification With Partial Labels,,,,,,,,Tuesday,Poster 1.1,42,Thibaut Durand,\r\n850,Visual Question Answering as Reading Comprehension,,,,,,,,Wednesday,Poster 2.1,209,Hui Li,\r\n851,Densely Semantically Aligned Person Re-Identification,,,,,,,,Tuesday,Poster 1.1,44,Zhizheng Zhang,\r\n852,Distilled Person Re-Identification: Towards a More Scalable System,,,,,,,,Tuesday,Poster 1.1,108,Ancong Wu,\r\n853,Speed Invariant Time Surface for Learning to Detect Corner Points With Event-Based Cameras,,,,,,,,Thursday,Poster 3.1,163,Jacques Manderscheid,\r\n854,EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching From Scratch,,,,,,,,Thursday,Poster 3.1,11,Jian Ren,\r\n855,Weakly Supervised Person Re-Identification,,,,,,,,Tuesday,Poster 1.1,53,Jingke Meng,\r\n856,Motion Estimation of Non-Holonomic Ground Vehicles From a Single Feature Correspondence Measured Over N Views,,,,,,,,Thursday,Poster 3.2,212,Kun Huang,\r\n857,How to Make a Pizza: Learning a Compositional Layer-Based GAN Model,,,,,,,,Wednesday,Poster 2.2,128,Dim P. 
Papadopoulos,\r\n858,Few-Shot Learning via Saliency-Guided Hallucination of Samples,,,,,,,,Tuesday,Poster 1.2,11,Hongguang Zhang,\r\n859,Triply Supervised Decoder Networks for Joint Detection and Segmentation,,,,,,,,Wednesday,Poster 2.2,66,Jiale Cao,\r\n860,OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge,,,,,,,,Tuesday,Poster 1.2,72,Kenneth Marino,\r\n861,Dynamic Recursive Neural Network,,,,,,,,Wednesday,Poster 2.1,60,Qiushan Guo,\r\n862,Fully Learnable Group Convolution for Acceleration of Deep Neural Networks,,,,,,,,Thursday,Poster 3.1,10,Xijun Wang,\r\n863,Sea-Thru: A Method for Removing Water From Underwater Images,,,,,,,,Tuesday,Poster 1.1,170,Derya Akkaynak,\r\n864,Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness Against Adversarial Attack,,,,,,,,Tuesday,Poster 1.1,36,Zhezhi He,\r\n866,The Domain Transform Solver,,,,,,,,Wednesday,Poster 2.1,178,Akash Bapat,\r\n867,A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes,,,,,,,,Thursday,Poster 3.2,183,Lichao Mou,\r\n868,Attending to Discriminative Certainty for Domain Adaptation,,,,,,,,Tuesday,Poster 1.1,26,Vinod Kumar Kurmi,\r\n869,X2CT-GAN: Reconstructing CT From Biplanar X-Rays With Generative Adversarial Networks,,,,,,,,Thursday,Poster 3.1,216,Xingde Ying,\r\n870,SceneCode: Monocular Dense Semantic Reconstruction Using Learned Encoded Scene Representations,,,,,,,,Thursday,Poster 3.2,82,Shuaifeng Zhi,\r\n871,Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network Using Truncated Gaussian Approximation,,,,,,,,Thursday,Poster 3.2,47,Zhezhi He,\r\n872,A Neural Network Based on SPD Manifold Learning for Skeleton-Based Hand Gesture Recognition,,,,,,,,Thursday,Poster 3.2,126,Xuan Son Nguyen,\r\n873,Hybrid Scene Compression for Visual Localization,,,,,,,,Wednesday,Poster 2.2,93,Federico Camposeco,\r\n874,Face-Focused Cross-Stream Network for Deception Detection in Videos,,,,,,,,Wednesday,Poster 
2.2,108,Mingyu Ding,\r\n875,SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization,,,,,,,,Thursday,Poster 3.2,24,Shijie Cao,\r\n876,Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks,,,,,,,,Thursday,Poster 3.2,31,Seungjoo Yoo,\r\n877,Point Cloud Oversegmentation With Graph-Structured Deep Metric Learning,,,,,,,,Wednesday,Poster 2.2,71,Loic Landrieu,\r\n878,Biologically-Constrained Graphs for Global Connectomics Reconstruction,,,,,,,,Tuesday,Poster 1.1,211,Brian Matejek,\r\n879,Learning Image and Video Compression Through Spatial-Temporal Energy Compaction,,,,,,,,Thursday,Poster 3.1,145,Zhengxue Cheng,\r\n880,A Variational EM Framework With Adaptive Edge Selection for Blind Motion Deblurring,,,,,,,,Thursday,Poster 3.1,155,Liuge Yang,\r\n881,SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation - A Synthetic Dataset and Baselines,,,,,,,,Tuesday,Poster 1.2,63,Yuan-Ting Hu,\r\n882,Skeleton-Based Action Recognition With Directed Graph Neural Networks,,,,,,,,Wednesday,Poster 2.2,119,Lei Shi,\r\n883,SPM-Tracker: Series-Parallel Matching for Real-Time Visual Object Tracking,,,,,,,,Tuesday,Poster 1.2,117,Guangting Wang,\r\n884,Beyond Volumetric Albedo -- A Surface Optimization Framework for Non-Line-Of-Sight Imaging,,,,,,,,Tuesday,Poster 1.1,156,Chia-Yin Tsai,\r\n885,What Do Single-View 3D Reconstruction Networks Learn?,,,,,,,,Tuesday,Poster 1.2,93,Maxim Tatarchenko,\r\n886,Collaborative Spatiotemporal Feature Learning for Video Action Recognition,,,,,,,,Wednesday,Poster 2.2,115,Chao Li,\r\n887,Low-Rank Tensor Completion With a New Tensor Nuclear Norm Induced by Invertible Linear Transforms,,,,,,,,Wednesday,Poster 2.1,176,Canyi Lu,\r\n888,Kernel Transformer Networks for Compact Spherical Convolution,,,,,,,,Thursday,Poster 3.1,51,Yu-Chuan Su,\r\n889,Cross-Atlas Convolution for Parameterization Invariant Learning on Textured Mesh Surface,,,,,,,,Wednesday,Poster 2.1,191,Shiwei 
Li,\r\n890,DeepLight: Learning Illumination for Unconstrained Mobile Mixed Reality,,,,,,,,Wednesday,Poster 2.1,168,Chloe LeGendre,\r\n891,Attribute-Aware Face Aging With Wavelet-Based Generative Adversarial Networks,,,,,,,,Thursday,Poster 3.2,110,Yunfan Liu,\r\n892,\"Mapping, Localization and Path Planning for Image-Based Navigation Using Visual Features and Map\",,,,,,,,Wednesday,Poster 2.2,65,Janine Thoma,\r\n893,Interactive Full Image Segmentation by Considering All Regions Jointly,,,,,,,,Thursday,Poster 3.2,66,Eirikur Agustsson,\r\n894,A Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning,,,,,,,,Thursday,Poster 3.1,179,Thanh-Toan Do,\r\n895,Answer Them All! Toward Universal Visual Question Answering Models,,,,,,,,Thursday,Poster 3.1,186,Robik Shrestha,\r\n896,Dynamic Scene Deblurring With Parameter Selective Sharing and Nested Skip Connections,,,,,,,,Tuesday,Poster 1.2,156,Hongyun Gao,\r\n898,R3 Adversarial Network for Cross Model Face Recognition,,,,,,,,Thursday,Poster 3.1,124,Ken Chen,\r\n899,Deep Global Generalized Gaussian Networks,,,,,,,,Wednesday,Poster 2.1,53,Qilong Wang,\r\n900,Toward Realistic Image Compositing With Adversarial Learning,,,,,,,,Wednesday,Poster 2.2,206,Bor-Chun Chen,\r\n901,Multi-Person Pose Estimation With Enhanced Channel-Wise and Spatial Information,,,,,,,,Wednesday,Poster 2.1,128,Kai Su,\r\n903,Exploiting Edge Features for Graph Neural Networks,,,,,,,,Thursday,Poster 3.1,27,Liyu Gong,\r\n904,Versatile Multiple Choice Learning and Its Application to Vision Computing,,,,,,,,Wednesday,Poster 2.1,212,Kai Tian,\r\n905,Jumping Manifolds: Geometry Aware Dense Non-Rigid Structure From Motion,,,,,,,,Wednesday,Poster 2.1,80,Suryansh Kumar,\r\n906,Semi-Supervised Learning With Graph Learning-Convolutional Networks,,,,,,,,Thursday,Poster 3.2,34,Bo Jiang,\r\n907,Scene Graph Generation With External Knowledge and Image Reconstruction,,,,,,,,Tuesday,Poster 1.1,199,Jiuxiang 
Gu,\r\n908,Large Scale High-Resolution Land Cover Mapping With Multi-Resolution Data,,,,,,,,Thursday,Poster 3.2,214,Caleb Robinson,\r\n909,Re-Identification With Consistent Attentive Siamese Networks,,,,,,,,Wednesday,Poster 2.1,134,Meng Zheng,\r\n910,You Look Twice: GaterNet for Dynamic Filter Selection in CNNs,,,,,,,,Thursday,Poster 3.1,23,Zhourong Chen,\r\n911,Understanding the Limitations of CNN-Based Absolute Camera Pose Regression,,,,,,,,Tuesday,Poster 1.2,83,Torsten Sattler,\r\n912,GCAN: Graph Convolutional Adversarial Network for Unsupervised Domain Adaptation,,,,,,,,Wednesday,Poster 2.2,173,Xinhong Ma,\r\n913,\"Catastrophic Child's Play: Easy to Perform, Hard to Defend Adversarial Attacks\",,,,,,,,Thursday,Poster 3.1,29,Chih-Hui Ho,\r\n915,Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization,,,,,,,,Tuesday,Poster 1.1,123,Chenchen Liu,\r\n916,Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition,,,,,,,,Wednesday,Poster 2.2,99,Yinpeng Dong,\r\n918,Networks for Joint Affine and Non-Parametric Image Registration,,,,,,,,Tuesday,Poster 1.2,212,Zhengyang Shen,\r\n919,Re-Ranking via Metric Fusion for Object Retrieval and Person Re-Identification,,,,,,,,Tuesday,Poster 1.1,51,Song Bai,\r\n922,Iterative Alignment Network for Continuous Sign Language Recognition,,,,,,,,Tuesday,Poster 1.2,206,Junfu Pu,\r\n923,A Neurobiological Evaluation Metric for Neural Network Model Search,,,,,,,,Wednesday,Poster 2.1,86,Nathaniel Blanchard,\r\n924,Reflective and Fluorescent Separation Under Narrow-Band Illumination,,,,,,,,Wednesday,Poster 2.2,85,Koji Koyamatsu,\r\n925,Text Guided Person Image Synthesis,,,,,,,,Tuesday,Poster 1.2,137,Xingran Zhou,\r\n926,An Iterative and Cooperative Top-Down and Bottom-Up Inference Network for Salient Object Detection,,,,,,,,Wednesday,Poster 2.1,173,Wenguan Wang,\r\n927,Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs,,,,,,,,Wednesday,Poster 2.2,176,Qi Zhang,\r\n928,Adaptive 
Pyramid Context Network for Semantic Segmentation,,,,,,,,Wednesday,Poster 2.2,79,Junjun He,\r\n929,Blind Geometric Distortion Correction on Images Through Deep Learning,,,,,,,,Wednesday,Poster 2.1,30,Xiaoyu Li,\r\n930,Listen to the Image,,,,,,,,Wednesday,Poster 2.2,125,Di Hu,\r\n931,Relational Knowledge Distillation,,,,,,,,Tuesday,Poster 1.2,186,Wonpyo Park,\r\n932,An Efficient Schmidt-EKF for 3D Visual-Inertial SLAM,,,,,,,,Thursday,Poster 3.2,133,Patrick Geneva,\r\n933,Pay Attention! - Robustifying a Deep Visuomotor Policy Through Task-Focused Visual Attention,,,,,,,,Tuesday,Poster 1.2,215,Pooya Abolghasemi,\r\n934,Scalable Convolutional Neural Network for Image Compressed Sensing,,,,,,,,Thursday,Poster 3.2,170,Wuzhen Shi,\r\n935,TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection,,,,,,,,Thursday,Poster 3.2,121,Lin Song,\r\n936,Memory-Attended Recurrent Network for Video Captioning,,,,,,,,Wednesday,Poster 2.2,199,Wenjie Pei,\r\n937,UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos,,,,,,,,Wednesday,Poster 2.2,153,Yang Wang,\r\n938,Object Discovery in Videos as Foreground Motion Clustering,,,,,,,,Thursday,Poster 3.1,137,Christopher Xie,\r\n939,Deep Supervised Cross-Modal Retrieval,,,,,,,,Thursday,Poster 3.1,178,Liangli Zhen,\r\n940,Learning to Reduce Dual-Level Discrepancy for Infrared-Visible Person Re-Identification,,,,,,,,Tuesday,Poster 1.1,39,Zhixiang Wang,\r\n941,\"Robustness via Curvature Regularization, and Vice Versa\",,,,,,,,Thursday,Poster 3.1,13,Seyed-Mohsen Moosavi-Dezfooli,\r\n942,Feedback Adversarial Learning: Spatial Feedback for Improving Generative Adversarial Networks,,,,,,,,Tuesday,Poster 1.1,149,Minyoung Huh,\r\n943,Learning to Explore Intrinsic Saliency for Stereoscopic Video,,,,,,,,Thursday,Poster 3.1,112,Qiudan Zhang,\r\n944,Ensemble Deep Manifold Similarity Learning Using Hard Proxies,,,,,,,,Wednesday,Poster 2.2,56,Nicolas Aziere,\r\n945,Hyperspectral Imaging With Random Printed 
Mask,,,,,,,,Thursday,Poster 3.1,153,Yuanyuan Zhao,\r\n946,Knowledge Distillation via Instance Relationship Graph,,,,,,,,Wednesday,Poster 2.2,17,Yufan Liu,\r\n947,Convolutional Recurrent Network for Road Boundary Extraction,,,,,,,,Thursday,Poster 3.1,73,Justin Liang,\r\n948,ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving,,,,,,,,Wednesday,Poster 2.1,91,Xibin Song,\r\n949,LO-Net: Deep Real-Time Lidar Odometry,,,,,,,,Wednesday,Poster 2.2,212,Qing Li,\r\n950,Building Efficient Deep Neural Networks With Unitary Group Convolutions,,,,,,,,Thursday,Poster 3.2,33,Ritchie Zhao,\r\n951,Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes,,,,,,,,Thursday,Poster 3.1,209,Chengquan Zhang,\r\n952,BeautyGlow: On-Demand Makeup Transfer Framework With Reversible Generative Network,,,,,,,,Thursday,Poster 3.1,142,Hung-Jen Chen,\r\n953,Learning Instance Activation Maps for Weakly Supervised Instance Segmentation,,,,,,,,Tuesday,Poster 1.2,64,Yi Zhu,\r\n954,Deep Embedding Learning With Discriminative Sampling Policy,,,,,,,,Wednesday,Poster 2.1,41,Yueqi Duan,\r\n955,Events-To-Video: Bringing Modern Computer Vision to Event Cameras,,,,,,,,Tuesday,Poster 1.2,157,Henri Rebecq,\r\n956,Improving Action Localization by Progressive Cross-Stream Cooperation,,,,,,,,Thursday,Poster 3.2,124,Rui Su,\r\n958,Decorrelated Adversarial Learning for Age-Invariant Face Recognition,,,,,,,,Tuesday,Poster 1.2,105,Hao Wang,\r\n959,Neural Sequential Phrase Grounding (SeqGROUND),,,,,,,,Tuesday,Poster 1.2,207,Pelin Dogan,\"Pelin Dogan, Leonid Sigal, Markus Gross\"\r\n961,Towards Rich Feature Discovery With Class Activation Maps Augmentation for Person Re-Identification,,,,,,,,Tuesday,Poster 1.1,140,Wenjie Yang,\r\n962,Collaborative Learning of Semi-Supervised Segmentation and Classification for Medical Images,,,,,,,,Tuesday,Poster 1.1,210,Yi Zhou,\r\n963,Progressive Teacher-Student Learning for Early Action Prediction,,,,,,,,Tuesday,Poster 1.2,108,Xionghui 
Wang,\r\n964,Unsupervised Domain Adaptation for ToF Data Denoising With Adversarial Learning,,,,,,,,Wednesday,Poster 2.1,104,Gianluca Agresti,\r\n965,Enhanced Bayesian Compression via Deep Reinforcement Learning,,,,,,,,Wednesday,Poster 2.2,2,Xin Yuan,\r\n966,GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving,,,,,,,,Tuesday,Poster 1.1,91,Buyu Li,\r\n967,Generalizable Person Re-Identification by Domain-Invariant Mapping Network,,,,,,,,Tuesday,Poster 1.1,49,Jifei Song,\r\n968,Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks,,,,,,,,Tuesday,Poster 1.2,5,Eunwoo Kim,\r\n969,Learning to Learn Relation for Important People Detection in Still Images,,,,,,,,Wednesday,Poster 2.1,45,Wei-Hong Li,\r\n970,Thinking Outside the Pool: Active Training Image Creation for Relative Attributes,,,,,,,,Tuesday,Poster 1.1,48,Aron Yu,\r\n971,Metric Learning for Image Registration,,,,,,,,Wednesday,Poster 2.2,211,Marc Niethammer,\r\n972,Dissimilarity Coefficient Based Weakly Supervised Object Detection,,,,,,,,Thursday,Poster 3.1,50,Aditya Arun,\r\n973,Semantic Component Decomposition for Face Attribute Manipulation,,,,,,,,Thursday,Poster 3.1,123,Ying-Cong Chen,\r\n974,Local to Global Learning: Gradually Adding Classes for Training Deep Neural Networks,,,,,,,,Wednesday,Poster 2.1,19,Hao Cheng,\r\n975,Combinatorial Persistency Criteria for Multicut and Max-Cut,,,,,,,,Wednesday,Poster 2.1,186,Jan-Hendrik Lange,\r\n976,Cascaded Generative and Discriminative Learning for Microcalcification Detection in Breast Mammograms,,,,,,,,Thursday,Poster 3.2,199,Fandong Zhang,\r\n977,Towards Robust Curve Text Detection With Conditional Spatial Expansion,,,,,,,,Wednesday,Poster 2.2,53,Zichuan Liu,\r\n978,Geometry-Aware Distillation for Indoor Semantic Segmentation,,,,,,,,Tuesday,Poster 1.2,39,Jianbo Jiao,\r\n980,Deep Exemplar-Based Video Colorization,,,,,,,,Wednesday,Poster 2.2,151,Bo Zhang,\r\n981,RGBD Based Dimensional Decomposition Residual Network for 3D Semantic 
Scene Completion,,,,,,,,Wednesday,Poster 2.2,97,Jie Li,\r\n982,Progressive Attention Memory Network for Movie Story Question Answering,,,,,,,,Wednesday,Poster 2.2,198,Junyeong Kim,\r\n983,Semantic Attribute Matching Networks,,,,,,,,Thursday,Poster 3.2,175,Seungryong Kim,\r\n984,Conditional Adversarial Generative Flow for Controllable Image Synthesis,,,,,,,,Wednesday,Poster 2.2,127,Rui Liu,\r\n985,Task Agnostic Meta-Learning for Few-Shot Learning,,,,,,,,Thursday,Poster 3.2,76,Muhammad Abdullah Jamal,\r\n986,Deep Spectral Clustering Using Dual Autoencoder Network,,,,,,,,Tuesday,Poster 1.2,196,Xu Yang,\r\n988,\"What Does It Mean to Learn in Deep Networks? And, How Does One Detect Adversarial Attacks?\",,,,,,,,Wednesday,Poster 2.1,20,Ciprian A. Corneanu,\r\n989,Context-Aware Spatio-Recurrent Curvilinear Structure Segmentation,,,,,,,,Thursday,Poster 3.2,206,Feigege Wang,\r\n990,Local Features and Visual Words Emerge in Activations,,,,,,,,Thursday,Poster 3.2,69,Oriane Simeoni,\r\n991,Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping,,,,,,,,Thursday,Poster 3.2,184,Suhas Lohit,\r\n992,Zoom-In-To-Check: Boosting Video Interpolation via Instance-Level Discrimination,,,,,,,,Thursday,Poster 3.2,141,Liangzhe Yuan,\r\n993,Adversarial Defense Through Network Profiling Based Path Extraction,,,,,,,,Wednesday,Poster 2.1,22,Yuxian Qiu,\r\n994,Semantic Alignment: Finding Semantically Consistent Ground-Truth for Facial Landmark Detection,,,,,,,,Tuesday,Poster 1.2,99,Zhiwei Liu,\r\n995,Arbitrary Style Transfer With Style-Attentional Networks,,,,,,,,Wednesday,Poster 2.1,164,Dae Young Park,\r\n996,Learning a Unified Classifier Incrementally via Rebalancing,,,,,,,,Tuesday,Poster 1.1,60,Saihui Hou,\r\n997,Event-Based High Dynamic Range Image and Very High Frame Rate Video Generation Using Conditional Generative Adversarial Networks,,,,,,,,Thursday,Poster 3.1,146,Lin Wang,\r\n998,Single Image Reflection Removal Beyond Linearity,,,,,,,,Tuesday,Poster 
1.2,148,Qiang Wen,\r\n999,Robustness Verification of Classification Deep Neural Networks via Linear Programming,,,,,,,,Thursday,Poster 3.2,45,Wang Lin,\r\n1000,Generalising Fine-Grained Sketch-Based Image Retrieval,,,,,,,,Tuesday,Poster 1.1,45,Kaiyue Pang,\r\n1001,ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation,,,,,,,,Thursday,Poster 3.2,43,Xiaoliang Dai,\r\n1002,Ray-Space Projection Model for Light Field Camera,,,,,,,,Thursday,Poster 3.1,150,Qi Zhang,\r\n1003,Learning to Detect Human-Object Interactions With Knowledge,,,,,,,,Tuesday,Poster 1.1,204,Bingjie Xu,\r\n1004,End-To-End Supervised Product Quantization for Image Search and Retrieval,,,,,,,,Wednesday,Poster 2.1,49,Benjamin Klein,\r\n1005,RAVEN: A Dataset for Relational and Analogical Visual REasoNing,,,https://wellyzhang.github.io/project/raven.html,,,,,Wednesday,Poster 2.1,77,Chi Zhang,\"Chi Zhang, Feng Gao, Baoxiong Jia, Yixin Zhu, Song-Chun Zhu\"\r\n1006,Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation,,,,,,,,Tuesday,Poster 1.1,147,Yazeed Alharbi,\r\n1007,Efficient Featurized Image Pyramid Network for Single Shot Detector,,,,,,,,Wednesday,Poster 2.2,60,Yanwei Pang,\r\n1008,Data Representation and Learning With Graph Diffusion-Embedding Networks,,,,,,,,Thursday,Poster 3.1,180,Bo Jiang,\r\n1009,Enhanced Pix2pix Dehazing Network,,,,,,,,Wednesday,Poster 2.2,162,Yanyun Qu,\r\n1010,FickleNet: Weakly and Semi-Supervised Semantic Image Segmentation Using Stochastic Inference,,,,,,,,Wednesday,Poster 2.1,72,Jungbeom Lee,\r\n1011,Adaptive Confidence Smoothing for Generalized Zero-Shot Learning,,,,,,,,Thursday,Poster 3.2,71,Yuval Atzmon,\r\n1012,Attentive Relational Networks for Mapping Images to Scene Graphs,,,,,,,,Tuesday,Poster 1.2,185,Mengshi Qi,\r\n1013,Multi-Task Learning of Hierarchical Vision-Language Representation,,,,,,,,Thursday,Poster 3.1,188,Duy-Kien Nguyen,\r\n1014,Moving Object Detection Under Discontinuous Change in Illumination Using 
Tensor Low-Rank and Invariant Sparse Decomposition,,,,,,,,Wednesday,Poster 2.2,48,Moein Shakeri,\r\n1015,A Content Transformation Block for Image Style Transfer,,,,,,,,Thursday,Poster 3.1,141,Dmytro Kotovenko,\r\n1016,Knockoff Nets: Stealing Functionality of Black-Box Models,,,,,,,,Wednesday,Poster 2.1,40,Tribhuvanesh Orekondy,\r\n1017,Unsupervised Moving Object Detection via Contextual Information Separation,,,,,,,,Tuesday,Poster 1.1,65,Yanchao Yang,\r\n1018,P2SGrad: Refined Gradients for Optimizing Deep Face Models,,,,,,,,Thursday,Poster 3.1,128,Xiao Zhang,\r\n1019,Fast Spatio-Temporal Residual Network for Video Super-Resolution,,,,,,,,Thursday,Poster 3.1,206,Sheng Li,\r\n1020,Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks,,,,,,,,Thursday,Poster 3.2,204,Stephen James,\r\n1021,Volumetric Capture of Humans With a Single RGBD Camera via Semi-Parametric Learning,,,,,,,,Thursday,Poster 3.1,108,Rohit Pandey,\r\n1022,Mind Your Neighbours: Image Annotation With Metadata Neighbourhood Graph Co-Attention Networks,,,,,,,,Tuesday,Poster 1.2,48,Junjie Zhang,\r\n1023,Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling,,,,,,,,Tuesday,Poster 1.2,85,Jiancheng Yang,\"Jiancheng Yang, Qiang Zhang, Bingbing Ni, Linguo Li, Jinxian Liu, Mengdie Zhou, Qi Tian\"\r\n1024,Fast User-Guided Video Object Segmentation by Interaction-And-Propagation Networks,,,,,,,,Wednesday,Poster 2.1,70,Seoung Wug Oh,\r\n1025,Explicit Spatial Encoding for Deep Local Descriptors,,,,,,,,Thursday,Poster 3.1,46,Arun Mukundan,\r\n1026,Learning Shape-Aware Embedding for Scene Text Detection,,,,,,,,Tuesday,Poster 1.2,213,Zhuotao Tian,\r\n1027,A Structured Model for Action Detection,,,,,,,,Thursday,Poster 3.1,135,Yubo Zhang,\r\n1028,Scan2Mesh: From Unstructured Range Scans to 3D Meshes,,,,,,,,Wednesday,Poster 2.1,103,Angela Dai,\r\n1029,Deep Single Image Camera Calibration With Radial Distortion,,,,,,,,Thursday,Poster 3.2,86,Manuel 
Lopez,\r\n1030,Dense Classification and Implanting for Few-Shot Learning,,,,,,,,Thursday,Poster 3.1,32,Yann Lifchitz,\r\n1031,DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene From Sparse LiDAR Data and Single Color Image,,,,,,,,Tuesday,Poster 1.2,84,Jiaxiong Qiu,\r\n1032,Object Tracking by Reconstruction With View-Specific Discriminative Correlation Filters,,,,,,,,Tuesday,Poster 1.1,135,Ugur Kart,\r\n1033,MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors,,,,,,,,Thursday,Poster 3.1,42,Lile Cai,\r\n1034,Visual Localization by Learning Objects-Of-Interest Dense Match Regression,,,,,,,,Wednesday,Poster 2.1,124,Philippe Weinzaepfel,\r\n1035,Cross Domain Model Compression by Structurally Weight Sharing,,,,,,,,Thursday,Poster 3.1,2,Shangqian Gao,\r\n1036,Learning to Generate Synthetic Data via Compositing,,,,,,,,Tuesday,Poster 1.1,23,Shashank Tripathi,\r\n1037,Fitting Multiple Heterogeneous Models by Multi-Class Cascaded T-Linkage,,,,,,,,Wednesday,Poster 2.2,73,Luca Magri,\r\n1038,Image-Question-Answer Synergistic Network for Visual Dialog,,,,,,,,Thursday,Poster 3.1,182,Dalu Guo,\r\n1039,Learning Multi-Class Segmentations From Single-Class Datasets,,,,,,,,Thursday,Poster 3.1,72,Konstantin Dmitriev,\r\n1040,Phase-Only Image Based Kernel Estimation for Single Image Blind Deblurring,,,,,,,,Wednesday,Poster 2.1,180,Liyuan Pan,\r\n1041,Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples,,,,,,,,Tuesday,Poster 1.1,63,Zihao Liu,\r\n1042,3D Guided Fine-Grained Face Manipulation,,,,,,,,Thursday,Poster 3.1,119,Zhenglin Geng,\r\n1043,An End-To-End Network for Generating Social Relationship Graphs,,,,,,,,Thursday,Poster 3.2,21,Arushi Goel,\r\n1044,Compact Feature Learning for Multi-Domain Image Classification,,,,,,,,Wednesday,Poster 2.2,45,Yajing Liu,\r\n1045,Privacy Preserving Image-Based Localization,,,,,,,,Wednesday,Poster 2.1,95,Pablo Speciale,\r\n1046,Learning View Priors for Single-View 3D 
Reconstruction,,,http://hiroharu-kato.com/projects_en/view_prior_learning.html,,,,,Thursday,Poster 3.1,115,Hiroharu Kato,\"Hiroharu Kato, Tatsuya Harada\"\r\n1047,SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints,,,,,,,,Tuesday,Poster 1.1,136,Amir Sadeghian,\r\n1048,Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking,,,,,,,,Wednesday,Poster 2.2,123,Heng Fan,\r\n1049,ClusterNet: Deep Hierarchical Cluster Network With Rigorously Rotation-Invariant Representation for Point Cloud Analysis,,,,,,,,Wednesday,Poster 2.1,44,Chao Chen,\r\n1050,Automatic Face Aging in Videos via Deep Reinforcement Learning,,,,,,,,Thursday,Poster 3.1,139,Chi Nhan Duong,\r\n1051,Unsupervised 3D Pose Estimation With Geometric Self-Supervision,,,,,,,,Wednesday,Poster 2.1,132,Ching-Hang Chen,\r\n1052,Did It Change? Learning to Detect Point-Of-Interest Changes for Proactive Map Updates,,,,,,,,Tuesday,Poster 1.2,198,Jerome Revaud,\r\n1053,Blind Super-Resolution With Iterative Kernel Correction,,,,,,,,Tuesday,Poster 1.1,162,Jinjin Gu,\r\n1054,DeFusionNET: Defocus Blur Detection via Recurrently Fusing and Refining Multi-Scale Deep Features,,,,,,,,Tuesday,Poster 1.2,4,Chang Tang,\r\n1055,Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition With Multimodal Training,,,,,,,,Tuesday,Poster 1.1,106,Mahdi Abavisani,\r\n1056,A Late Fusion CNN for Digital Matting,,,,,,,,Wednesday,Poster 2.2,74,Yunke Zhang,\r\n1057,Double Nuclear Norm Based Low Rank Representation on Grassmann Manifolds for Clustering,,,,,,,,Thursday,Poster 3.2,130,Xinglin Piao,\r\n1058,Embedding Complementary Deep Networks for Image Classification,,,,,,,,Thursday,Poster 3.1,30,Qiuyu Chen,\r\n1060,Does Learning Specific Features for Related Parts Help Human Pose Estimation?,,,,,,,,Tuesday,Poster 1.1,100,Wei Tang,\r\n1061,Adversarial Semantic Alignment for Improved Image Captions,,,,,,,,Thursday,Poster 3.1,185,Pierre Dognin,\r\n1062,Learning to Learn Image Classifiers 
With Visual Analogy,,,,,,,,Thursday,Poster 3.2,53,Linjun Zhou,\r\n1063,VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People,,,,,,,,Tuesday,Poster 1.1,71,Danna Gurari,\r\n1064,Distant Supervised Centroid Shift: A Simple and Efficient Approach to Visual Domain Adaptation,,,,,,,,Tuesday,Poster 1.2,50,Jian Liang,\r\n1065,Fully Quantized Network for Object Detection,,,,,,,,Tuesday,Poster 1.2,15,Rundong Li,\r\n1066,Learning From Noisy Labels by Regularized Estimation of Annotator Confusion,,,,,,,,Thursday,Poster 3.2,27,Ryutaro Tanno,\r\n1068,KE-GAN: Knowledge Embedded Generative Adversarial Networks for Semi-Supervised Scene Parsing,,,,,,,,Wednesday,Poster 2.1,69,Mengshi Qi,\r\n1069,Separate to Adapt: Open Set Domain Adaptation via Progressive Separation,,,,,,,,Tuesday,Poster 1.2,45,Hong Liu,\r\n1070,Spatially Variant Linear Representation Models for Joint Filtering,,,,,,,,Tuesday,Poster 1.1,172,Jinshan Pan,\r\n1071,Building Detail-Sensitive Semantic Segmentation Networks With Polynomial Pooling,,,,,,,,Wednesday,Poster 2.2,19,Zhen Wei,\r\n1072,Towards Universal Object Detection by Domain Attention,,,http://www.svcl.ucsd.edu/projects/universal-detection/?,,,,,Wednesday,Poster 2.2,55,Xudong Wang,\"Xudong Wang, Zhaowei Cai, Dashan Gao, Nuno Vasconcelos\"\r\n1073,Salient Object Detection With Pyramid Attention and Salient Edges,,,,,,,,Tuesday,Poster 1.1,146,Wenguan Wang,\r\n1074,Multi-Adversarial Discriminative Deep Domain Generalization for Face Presentation Attack Detection,,,,,,,,Thursday,Poster 3.1,140,Rui Shao,\r\n1075,CLEVR-Ref+: Diagnosing Visual Reasoning With Referring Expressions,,,,,,,,Tuesday,Poster 1.2,208,Runtao Liu,\r\n1076,A Flexible Convolutional Solver for Fast Style Transfers,,,,,,,,Thursday,Poster 3.1,1,Gilles Puy,\r\n1077,MVTec AD -- A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection,,,,,,,,Thursday,Poster 3.1,96,Paul Bergmann,\r\n1078,MetaCleaner: Learning to 
Hallucinate Clean Representations for Noisy-Labeled Visual Recognition,,,,,,,,Wednesday,Poster 2.2,64,Weihe Zhang,\r\n1079,Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction,,,,,,,,Thursday,Poster 3.2,91,Jason Ku,\r\n1080,Contrastive Adaptation Network for Unsupervised Domain Adaptation,,,,,,,,Wednesday,Poster 2.1,34,Guoliang Kang,\r\n1081,\"NetTailor: Tuning the Architecture, Not Just the Weights\",,,,,,,,Tuesday,Poster 1.2,57,Pedro Morgado,\r\n1082,Variational Information Distillation for Knowledge Transfer,,,,,,,,Thursday,Poster 3.1,22,Sungsoo Ahn,\r\n1083,FA-RPN: Floating Region Proposals for Face Detection,,,,,,,,Wednesday,Poster 2.2,100,Mahyar Najibi,\r\n1084,Multi-Agent Tensor Fusion for Contextual Trajectory Prediction,,,,,,,,Thursday,Poster 3.2,135,Tianyang Zhao,\r\n1085,Bayesian Hierarchical Dynamic Model for Human Action Recognition,,,,,,,,Wednesday,Poster 2.2,101,Rui Zhao,\r\n1087,A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem,,,,,,,,Wednesday,Poster 2.1,188,Ganzhao Yuan,\r\n1088,Recurrent Neural Network for (Un-)Supervised Learning of Monocular Video Visual Odometry and Depth,,,,,,,,Wednesday,Poster 2.1,101,Rui Wang,\r\n1089,F-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning,,,,,,,,Thursday,Poster 3.1,166,Yongqin Xian,\r\n1091,Learning RoI Transformer for Oriented Object Detection in Aerial Images,,,,,,,,Tuesday,Poster 1.2,37,Jian Ding,\r\n1092,Training Deep Learning Based Image Denoisers From Undersampled Measurements Without Ground Truth and Without Image Prior,,,,,,,,Thursday,Poster 3.1,164,Magauiya Zhussip,\r\n1093,Action4D: Online Action Recognition in the Crowd and Clutter,,,,,,,,Thursday,Poster 3.2,90,Quanzeng You,\r\n1094,Structural Relational Reasoning of Point Clouds,,,,,,,,Tuesday,Poster 1.1,84,Yueqi Duan,\r\n1095,Fast Object Class Labelling via Speech,,,,,,,,Wednesday,Poster 2.1,82,Michael Gygli,\r\n1096,Object Instance Annotation With Deep Extreme Level Set 
Evolution,,,,,,,,Wednesday,Poster 2.2,77,Zian Wang,\r\n1097,Graphical Contrastive Losses for Scene Graph Parsing,,,,,,,,Thursday,Poster 3.2,57,Ji Zhang,\r\n1099,Time-Conditioned Action Anticipation in One Shot,,,,,,,,Thursday,Poster 3.1,130,Qiuhong Ke,\r\n1100,Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation,,,,,,,,Thursday,Poster 3.2,205,Haofu Liao,\r\n1101,Quantization Networks,,,,,,,,Wednesday,Poster 2.2,57,Jiwei Yang,\r\n1102,Topology Reconstruction of Tree-Like Structure in Images via Structural Similarity Measure and Dominant Set Clustering,,,,,,,,Wednesday,Poster 2.2,215,Jianyang Xie,\r\n1103,Perceive Where to Focus: Learning Visibility-Aware Part-Level Features for Partial Person Re-Identification,,,,,,,,Tuesday,Poster 1.1,16,Yifan Sun,\r\n1104,Learning Without Memorizing,,,,,,,,Wednesday,Poster 2.1,59,Prithviraj Dhar,\r\n1106,MMFace: A Multi-Metric Regression Network for Unconstrained Face Reconstruction,,,,,,,,Wednesday,Poster 2.2,94,Hongwei Yi,\r\n1107,Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation,,,,,,,,Thursday,Poster 3.1,114,Andrea Pilzer,\r\n1108,Learning to Localize Through Compressed Binary Maps,,,,,,,,Thursday,Poster 3.1,170,Xinkai Wei,\r\n1109,Min-Max Statistical Alignment for Transfer Learning,,,,,,,,Thursday,Poster 3.1,35,Samitha Herath,\r\n1110,Multi-Step Prediction of Occupancy Grid Maps With Recurrent Neural Networks,,,,,,,,Thursday,Poster 3.1,214,Nima Mohajerin,\r\n1111,Pointing Novel Objects in Image Captioning,,,,,,,,Thursday,Poster 3.2,191,Yehao Li,\r\n1112,Task-Free Continual Learning,,,,,,,,Thursday,Poster 3.2,28,Rahaf Aljundi,\r\n1113,A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks,,,,,,,,Wednesday,Poster 2.2,23,Yinghao Xu,\r\n1114,DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition,,,,,,,,Tuesday,Poster 1.1,128,Zheng Shou,\r\n1115,Low-Rank 
Laplacian-Uniform Mixed Model for Robust Face Recognition,,,,,,,,Thursday,Poster 3.2,112,Jiayu Dong,\r\n1116,PIEs: Pose Invariant Embeddings,,,,,,,,Thursday,Poster 3.2,179,Chih-Hui Ho,\r\n1117,Discovering Visual Patterns in Art Collections With Spatially-Consistent Feature Learning,,,,,,,,Thursday,Poster 3.1,34,Xi Shen,\r\n1118,Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning,,,,,,,,Thursday,Poster 3.2,190,Nayyer Aafaq,\r\n1119,Marginalized Latent Semantic Encoder for Zero-Shot Learning,,,,,,,,Wednesday,Poster 2.1,196,Zhengming Ding,\r\n1120,Large-Scale Few-Shot Learning: Knowledge Transfer With Class Hierarchy,,,,,,,,Wednesday,Poster 2.2,47,Aoxue Li,\r\n1121,Deep Robust Subjective Visual Property Prediction in Crowdsourcing,,,,,,,,Thursday,Poster 3.1,4,Qianqian Xu,\r\n1122,Soft Labels for Ordinal Regression,,,,,,,,Wednesday,Poster 2.1,18,Raul Diaz,\r\n1123,Learning Words by Drawing Images,,,,,,,,Tuesday,Poster 1.1,205,Didac Suris,\r\n1124,Learning to Separate Multiple Illuminants in a Single Image,,,,,,,,Tuesday,Poster 1.2,149,Zhuo Hui,\r\n1125,Context-Aware Visual Compatibility Prediction,,,,,,,,Thursday,Poster 3.2,203,Guillem Cucurull,\r\n1126,LiveSketch: Query Perturbations for Guided Sketch-Based Visual Search,,,,,,,,Tuesday,Poster 1.2,40,John Collomosse,\r\n1127,Explicit Bias Discovery in Visual Question Answering Models,,,,,,,,Thursday,Poster 3.1,93,Varun Manjunatha,\r\n1128,Single-Frame Regularization for Temporally Stable CNNs,,,,,,,,Thursday,Poster 3.2,20,Gabriel Eilertsen,\r\n1129,Mask-Guided Portrait Editing With Conditional GANs,,,,,,,,Tuesday,Poster 1.2,96,Shuyang Gu,\r\n1130,Large-Scale Interactive Object Segmentation With Human Annotators,,,,,,,,Thursday,Poster 3.2,74,Rodrigo Benenson,\r\n1131,On Exploring Undetermined Relationships for Visual Relationship Detection,,,,,,,,Wednesday,Poster 2.1,58,Yibing Zhan,\r\n1132,A Compact Embedding for Facial Expression Similarity,,,,,,,,Wednesday,Poster 
2.1,129,Raviteja Vemulapalli,\r\n1133,Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification,,,,,,,,Tuesday,Poster 1.1,50,Hao Guo,\r\n1134,Object-Driven Text-To-Image Synthesis via Adversarial Training,,,,,,,,Thursday,Poster 3.2,140,Wenbo Li,\r\n1135,Creative Flow+ Dataset,,,,,,,,Wednesday,Poster 2.1,84,Maria Shugrina,\r\n1136,Towards Real Scene Super-Resolution With Raw Images,,,,,,,,Tuesday,Poster 1.1,174,Xiangyu Xu,\r\n1137,Inverse Discriminative Networks for Handwritten Signature Verification,,,,,,,,Wednesday,Poster 2.1,152,Ping Wei,\r\n1138,Compressing Unknown Images With Product Quantizer for Efficient Zero-Shot Classification,,,,,,,,Wednesday,Poster 2.1,92,Jin Li,\r\n1139,Disentangling Latent Hands for Image Synthesis and Pose Estimation,,https://mp.weixin.qq.com/s/U45X5kKMw_E4vfHfVmWZTA,,,,,,Thursday,Poster 3.1,125,Linlin Yang,\r\n1140,Feature Selective Anchor-Free Module for Single-Shot Object Detection,,,,,,,,Tuesday,Poster 1.1,61,Chenchen Zhu,\"Chenchen Zhu, Yihui He, Marios Savvides\"\r\n1141,Rare Event Detection Using Disentangled Representation Learning,,,,,,,,Thursday,Poster 3.1,39,Ryuhei Hamaguchi,\r\n1142,Joint Manifold Diffusion for Combining Predictions on Decoupled Observations,,,,,,,,Wednesday,Poster 2.2,82,Kwang In Kim,\r\n1143,Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences,,,,,,,,Thursday,Poster 3.1,8,Seonguk Seo,\r\n1144,DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios,,,,,,,,Tuesday,Poster 1.1,67,Guorun Yang,\r\n1145,VRSTC: Occlusion-Free Video Person Re-Identification,,,,,,,,Wednesday,Poster 2.2,44,Ruibing Hou,\r\n1146,Crowd Counting and Density Estimation by Trellis Encoder-Decoder Networks,,,,,,,,Wednesday,Poster 2.1,190,Xiaolong Jiang,\r\n1147,Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition,,,,,,,,Thursday,Poster 3.2,127,Deepti Ghadiyaram,\r\n1148,Spherical Fractal Convolutional Neural 
Networks for Point Cloud Recognition,,,,,,,,Tuesday,Poster 1.1,22,Yongming Rao,\r\n1149,Signal-To-Noise Ratio: A Robust Distance Metric for Deep Metric Learning,,,,,,,,Wednesday,Poster 2.1,26,Tongtong Yuan,\r\n1150,All You Need Is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification,,,,,,,,Wednesday,Poster 2.2,50,Weijie Chen,\r\n1151,Deep Transfer Learning for Multiple Class Novelty Detection,,,,,,,,Thursday,Poster 3.2,58,Pramuditha Perera,\r\n1152,Inserting Videos Into Videos,,,,,,,,Thursday,Poster 3.1,144,Donghoon Lee,\r\n1153,Robust Facial Landmark Detection via Occlusion-Adaptive Deep Networks,,,,,,,,Tuesday,Poster 1.2,101,Meilu Zhu,\r\n1154,Part-Regularized Near-Duplicate Vehicle Re-Identification,,,,,,,,Tuesday,Poster 1.2,189,Bing He,\r\n1155,Unsupervised Multi-Modal Neural Machine Translation,,,,,,,,Thursday,Poster 3.1,187,Yuanhang Su,\r\n1156,Video Summarization by Learning From Unpaired Data,,,,,,,,Wednesday,Poster 2.2,118,Mrigank Rochan,\r\n1157,Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments,,,,,,,,Thursday,Poster 3.2,178,Xueting Li,\r\n1158,Transferable AutoML by Model Sharing Over Grouped Datasets,,,,,,,,Thursday,Poster 3.1,5,Chao Xue,\r\n1159,When Color Constancy Goes Wrong: Correcting Improperly White-Balanced Images,,,,,,,,Tuesday,Poster 1.1,155,Mahmoud Afifi,\r\n1160,Density Map Regression Guided Detection Network for RGB-D Crowd Counting and Localization,,,,,,,,Tuesday,Poster 1.1,184,Dongze Lian,\r\n1161,Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks,,,,,,,,Tuesday,Poster 1.1,31,Kuan Fang,\r\n1162,VERI-Wild: A Large Dataset and a New Method for Vehicle Re-Identification in the Wild,,,,,,,,Tuesday,Poster 1.2,76,Yihang Lou,\r\n1163,Enhancing TripleGAN for Semi-Supervised Conditional Instance Synthesis and Classification,,,,,,,,Thursday,Poster 3.1,147,Si Wu,\r\n1164,Knowing When to Stop: Evaluation and Verification of Conformity to Output-Size 
Specifications,,,,,,,,Thursday,Poster 3.2,167,Chenglong Wang,\r\n1165,Social Relation Recognition From Videos via Multi-Scale Spatial-Temporal Reasoning,,,,,,,,Tuesday,Poster 1.2,109,Xinchen Liu,\r\n1166,LBS Autoencoder: Self-Supervised Fitting of Articulated Meshes to Point Clouds,,,,,,,,Thursday,Poster 3.2,119,Chun-Liang Li,\r\n1167,All-Weather Deep Outdoor Lighting Estimation,,,,,,,,Thursday,Poster 3.1,154,Jinsong Zhang,\r\n1168,Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training,,,,,,,,Wednesday,Poster 2.2,216,Feng Zheng,\r\n1169,Not Using the Car to See the Sidewalk -- Quantifying and Controlling the Effects of Context in Classification and Segmentation,,,,,,,,Wednesday,Poster 2.2,168,Rakshith Shetty,\r\n1170,Iterative Projection and Matching: Finding Structure-Preserving Representatives and Its Application to Computer Vision,,,,,,,,Wednesday,Poster 2.1,87,Alireza Zaeemzadeh,\r\n1171,Hyperspectral Image Reconstruction Using a Deep Spatial-Spectral Prior,,,,,,,,Wednesday,Poster 2.2,149,Lizhi Wang,\r\n1172,Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally,,,,,,,,Thursday,Poster 3.1,121,Xiuyi Jia,\r\n1173,Shape Unicode: A Unified Shape Representation,,,,,,,,Tuesday,Poster 1.2,150,Sanjeev Muralikrishnan,\r\n1174,Co-Occurrent Features in Semantic Segmentation,,,,,,,,Tuesday,Poster 1.1,32,Hang Zhang,\r\n1175,Generalized Zero-Shot Recognition Based on Visually Semantic Embedding,,,,,,,,Tuesday,Poster 1.2,52,Pengkai Zhu,\r\n1176,Foreground-Aware Image Inpainting,,,,,,,,Wednesday,Poster 2.1,160,Wei Xiong,\r\n1177,Greedy Structure Learning of Hierarchical Compositional Models,,,,,,,,Thursday,Poster 3.2,65,Adam Kortylewski,\r\n1178,Practical Coding Function Design for Time-Of-Flight Imaging,,,,,,,,Tuesday,Poster 1.1,158,Felipe Gutierrez-Barragan,\r\n1179,Skin-Based Identification From Multispectral Image Data Using CNNs,,,,,,,,Thursday,Poster 3.2,176,Takeshi Uemori,\r\n1180,Fast and Flexible Indoor Scene Synthesis via 
Deep Convolutional Generative Models,,,,,,,,Wednesday,Poster 2.1,195,Daniel Ritchie,\r\n1181,Cross-Modal Relationship Inference for Grounding Referring Expressions,,,,,,,,Tuesday,Poster 1.2,204,Sibei Yang,\r\n1182,Polynomial Representation for Persistence Diagram,,,,,,,,Wednesday,Poster 2.1,189,Zhichao Wang,\r\n1183,Learning Individual Styles of Conversational Gesture,,,,,,,,Tuesday,Poster 1.2,102,Shiry Ginosar,\r\n1184,Video Magnification in the Wild Using Fractional Anisotropy in Temporal Distribution,,,,,,,,Tuesday,Poster 1.1,163,Shoichiro Takeda,\r\n1185,Box-Driven Class-Wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation,,,,,,,,Tuesday,Poster 1.2,66,Chunfeng Song,\r\n1186,SR-LSTM: State Refinement for LSTM Towards Pedestrian Trajectory Prediction,,,,,,,,Thursday,Poster 3.2,131,Pu Zhang,\r\n1187,\"Event Cameras, Contrast Maximization and Reward Functions: An Analysis\",,,,,,,,Thursday,Poster 3.2,171,Timo Stoffregen,\r\n1188,Multi-Level Context Ultra-Aggregation for Stereo Matching,,,,,,,,Tuesday,Poster 1.2,81,Guang-Yu Nie,\r\n1189,Dense Depth Posterior (DDP) From Single Image and Sparse Range,,,,,,,,Tuesday,Poster 1.2,88,Yanchao Yang,\r\n1190,\"Large-Scale, Metric Structure From Motion for Unordered Light Fields\",,,,,,,,Tuesday,Poster 1.2,82,Sotiris Nousias,\r\n1191,LSTA: Long Short-Term Attention for Egocentric Action Recognition,,,,,,,,Thursday,Poster 3.1,133,Swathikiran Sudhakaran,\r\n1192,\"Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More\",,,,,,,,Tuesday,Poster 1.2,17,Jingwen Ye,\r\n1193,Feature Transfer Learning for Face Recognition With Under-Represented Data,,,,,,,,Wednesday,Poster 2.1,131,Xi Yin,\r\n1194,Multi-Task Multi-Sensor Fusion for 3D Object Detection,,,,,,,,Wednesday,Poster 2.2,61,Ming Liang,\r\n1195,DuDoNet: Dual Domain Network for CT Metal Artifact Reduction,,,,,,,,Thursday,Poster 3.1,205,Wei-An Lin,\r\n1196,Binary Ensemble Neural Network: More 
Bits per Network or More Networks per Bit?,,,,,,,,Wednesday,Poster 2.1,37,Shilin Zhu,\r\n1197,Re-Identification Supervised Texture Generation,,,,,,,,Thursday,Poster 3.2,89,Jian Wang,\r\n1198,Deep Spherical Quantization for Image Search,,,,,,,,Thursday,Poster 3.2,73,Sepehr Eghbali,\r\n1199,Predicting Future Frames Using Retrospective Cycle GAN,,,,,,,,Tuesday,Poster 1.1,183,Yong-Hoon Kwon,\r\n1200,Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation,,,,,,,,Tuesday,Poster 1.2,98,Yong Zhang,\r\n1201,MFAS: Multimodal Fusion Architecture Search,,,,,,,,Wednesday,Poster 2.2,4,Juan-Manuel Perez-Rua,\r\n1202,Feature Space Perturbations Yield More Transferable Adversarial Examples,,,,,,,,Wednesday,Poster 2.2,14,Nathan Inkawhich,\r\n1203,A Perceptual Prediction Framework for Self Supervised Event Segmentation,,,,,,,,Tuesday,Poster 1.1,121,Sathyanarayanan N. Aakur,\r\n1204,Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection,,,,,,,,Thursday,Poster 3.2,187,Taekyung Kim,\r\n1205,Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses,,,,,,,,Thursday,Poster 3.1,183,Jing Shi,\r\n1206,Few-Shot Adaptive Faster R-CNN,,,,,,,,Wednesday,Poster 2.2,43,Tao Wang,\r\n1207,Multi-Scale Geometric Consistency Guided Multi-View Stereo,,,,,,,,Wednesday,Poster 2.1,94,Qingshan Xu,\r\n1208,Dense 3D Face Decoding Over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders,,,,,,,,Tuesday,Poster 1.1,99,Yuxiang Zhou,\r\n1209,An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition,,,,,,,,Tuesday,Poster 1.1,124,Chenyang Si,\r\n1210,LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks,,,,,,,,Wednesday,Poster 2.1,35,Sudhakar Kumawat,\r\n1212,Adaptive Weighting Multi-Field-Of-View CNN for Semantic Segmentation in Pathology,,,,,,,,Thursday,Poster 3.2,201,Hiroki Tokunaga,\r\n1213,The Alignment of the Spheres: 
Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation,,,,,,,,Thursday,Poster 3.2,84,Dylan Campbell,\r\n1214,PEPSI : Fast Image Inpainting With Parallel Decoding Network,,,,,,,,Thursday,Poster 3.2,39,Min-cheol Sagong,\r\n1215,It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning,,,,,,,,Tuesday,Poster 1.1,195,Monica Haurilet,\r\n1216,Co-Saliency Detection via Mask-Guided Fully Convolutional Networks With Multi-Scale Label Smoothing,,,,,,,,Tuesday,Poster 1.2,62,Kaihua Zhang,\r\n1217,Where's Wally Now? Deep Generative and Discriminative Embeddings for Novelty Detection,,,,,,,,Thursday,Poster 3.2,54,Philippe Burlina,\r\n1218,Point-To-Pose Voting Based Hand Pose Estimation Using Residual Permutation Equivariant Layer,,,,,,,,Thursday,Poster 3.2,115,Shile Li,\r\n1219,Context-Reinforced Semantic Segmentation,,,,,,,,Tuesday,Poster 1.2,194,Yizhou Zhou,\r\n1220,BridgeNet: A Continuity-Aware Probabilistic Network for Age Estimation,,,,,,,,Tuesday,Poster 1.1,104,Wanhua Li,\r\n1221,DynTypo: Example-Based Dynamic Text Effects Transfer,,,https://menyifang.github.io/projects/DynTypo/DynTypo.html?,,,,,Wednesday,Poster 2.1,163,Yifang Men,Yifang Men Zhouhui Lian Yingmin Tang Jianguo Xiao\r\n1222,StereoDRNet: Dilated Residual StereoNet,,,,,,,,Thursday,Poster 3.2,83,Rohan Chabra,\r\n1223,Analysis of Feature Visibility in Non-Line-Of-Sight Measurements,,,,,,,,Thursday,Poster 3.1,152,Xiaochun Liu,\r\n1225,\"Unsupervised Event-Based Learning of Optical Flow, Depth, and Egomotion\",,,,,,,,Tuesday,Poster 1.1,88,Alex Zihao Zhu,\r\n1226,Sensitive-Sample Fingerprinting of Deep Neural Networks,,,,,,,,Wednesday,Poster 2.1,17,Zecheng He,\r\n1227,Octree Guided CNN With Spherical Kernels for 3D Point Clouds,,,,,,,,Thursday,Poster 3.1,100,Huan Lei,\r\n1228,Assessment of Faster R-CNN in Man-Machine Collaborative Search,,,,,,,,Tuesday,Poster 1.2,71,Arturo Deza,\r\n1229,ADCrowdNet: An Attention-Injective Deformable 
Convolutional Network for Crowd Understanding,,,,,,,,Tuesday,Poster 1.2,75,Ning Liu,\r\n1230,Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables,,,,,,,,Tuesday,Poster 1.2,203,Yan Xu,\r\n1231,L3-Net: Towards Learning Based LiDAR Localization for Autonomous Driving,,,,,,,,Wednesday,Poster 2.1,216,Weixin Lu,\r\n1232,Unsupervised Image Matching and Object Discovery as Optimization,,,,,,,,Wednesday,Poster 2.2,175,Huy V. Vo,\r\n1233,P3SGD: Patient Privacy Preserving SGD for Regularizing Deep CNNs in Pathological Image Classification,,,,,,,,Tuesday,Poster 1.1,212,Bingzhe Wu,\r\n1234,Self-Critical N-Step Training for Image Captioning,,,,,,,,Wednesday,Poster 2.1,207,Junlong Gao,\r\n1235,ZigZagNet: Fusing Top-Down and Bottom-Up Context for Object Segmentation,,,,,,,,Wednesday,Poster 2.2,76,Di Lin,\r\n1236,DARNet: Deep Active Ray Network for Building Segmentation,,,,,,,,Wednesday,Poster 2.2,70,Dominic Cheng,\r\n1237,Efficient Multi-Domain Learning by Covariance Normalization,,,,,,,,Wednesday,Poster 2.1,88,Yunsheng Li,\r\n1238,Deep Fitting Degree Scoring Network for Monocular 3D Object Detection,,,,,,,,Tuesday,Poster 1.1,95,Lijie Liu,\r\n1239,Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment,,,,,,,,Thursday,Poster 3.1,107,Pedro Miraldo,\r\n1240,A Robust Local Spectral Descriptor for Matching Non-Rigid Shapes With Incompatible Shape Structures,,,,,,,,Wednesday,Poster 2.1,200,Yiqun Wang,\r\n1241,Local Temporal Bilinear Pooling for Fine-Grained Action Parsing,,,,,,,,Thursday,Poster 3.2,123,Yan Zhang,\r\n1243,Towards Optimal Structured CNN Pruning via Generative Adversarial Learning,,,,,,,,Tuesday,Poster 1.2,13,Shaohui Lin,\r\n1244,Semantically Aligned Bias Reducing Zero Shot Learning,,,,,,,,Wednesday,Poster 2.2,13,Akanksha Paul,\r\n1245,Synthesizing 3D Shapes From Silhouette Image Collections Using Multi-Projection Generative Adversarial Networks,,,,,,,,Wednesday,Poster 2.1,99,Xiao Li,\r\n1246,Temporal 
Cycle-Consistency Learning,,,https://sites.google.com/view/temporal-cycle-consistency?,,,,,Tuesday,Poster 1.1,182,Debidatta Dwibedi,\"Debidatta Dwibedi, Yusuf Aytar, Jonathan Tompson, Pierre Sermanet, Andrew Zisserman\"\r\n1248,DuLa-Net: A Dual-Projection Network for Estimating Room Layouts From a Single RGB Panorama,,,,,,,,Tuesday,Poster 1.2,89,Shang-Ta Yang,\r\n1249,Deep Sketch-Shape Hashing With Segmented 3D Stochastic Viewing,,,,,,,,Tuesday,Poster 1.1,56,Jiaxin Chen,\r\n1250,Joint Representative Selection and Feature Learning: A Semi-Supervised Approach,,,,,,,,Wednesday,Poster 2.1,177,Suchen Wang,\r\n1251,Connecting the Dots: Learning Representations for Active Monocular Depth Estimation,,,,,,,,Wednesday,Poster 2.2,90,Gernot Riegler,\r\n1252,Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification,,,,,,,,Wednesday,Poster 2.1,202,Wen-Hsuan Chu,\r\n1253,An Alternative Deep Feature Approach to Line Level Keyword Spotting,,,,,,,,Thursday,Poster 3.2,207,George Retsinas,\r\n1254,RF-Net: An End-To-End Image Matching Network Based on Receptive Field,,,,,,,,Wednesday,Poster 2.2,159,Xuelun Shen,\r\n1255,Self-Supervised Convolutional Subspace Clustering Network,,,,,,,,Wednesday,Poster 2.1,93,Junjian Zhang,\r\n1256,Generalizing Eye Tracking With Bayesian Adversarial Learning,,,,,,,,Thursday,Poster 3.2,113,Kang Wang,\r\n1257,Robust Video Stabilization by Optimization in CNN Weight Space,,,,,,,,Tuesday,Poster 1.2,151,Jiyang Yu,\r\n1258,GIF2Video: Color Dequantization and Temporal Interpolation of GIF Images,,,,,,,,Tuesday,Poster 1.1,143,Yang Wang,\r\n1259,DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs,,,,,,,,Tuesday,Poster 1.2,47,Erkun Yang,\r\n1260,Multi-Person Articulated Tracking With Spatial and Temporal Embeddings,,,,,,,,Wednesday,Poster 2.1,127,Sheng Jin,\r\n1261,Knowledge Adaptation for Efficient Semantic Segmentation,,,,,,,,Tuesday,Poster 1.1,35,Tong He,\r\n1262,Machine Vision Guided 3D Medical Image Compression for Efficient 
Transmission and Accurate Segmentation in the Clouds,,,,,,,,Thursday,Poster 3.2,210,Zihao Liu,\r\n1263,Interpreting CNNs via Decision Trees,,,,,,,,Wednesday,Poster 2.1,203,Quanshi Zhang,\r\n1264,Surface Reconstruction From Normals: A Robust DGP-Based Discontinuity Preservation Approach,,,,,,,,Wednesday,Poster 2.1,78,Wuyuan Xie,\r\n1265,IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction,,,,,,,,Wednesday,Poster 2.2,15,Dominic Jack,\r\n1266,Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction,,,,,,,,Wednesday,Poster 2.2,22,Osama Makansi,\r\n1267,MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment,,,,,,,,Tuesday,Poster 1.1,126,Da Zhang,\"Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang, Larry S. Davis\"\r\n1268,End-To-End Learned Random Walker for Seeded Image Segmentation,,,,,,,,Thursday,Poster 3.2,197,Lorenzo Cerrone,\r\n1269,Object-Aware Aggregation With Bidirectional Temporal Graph for Video Captioning,,,,,,,,Wednesday,Poster 2.2,197,Junchao Zhang,\r\n1270,Variational Convolutional Neural Network Pruning,,,,,,,,Tuesday,Poster 1.2,12,Chenglong Zhao,\r\n1271,Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition,,,,,,,,Tuesday,Poster 1.2,8,Yaping Zhang,\r\n1272,Structured Pruning of Neural Networks With Budget-Aware Regularization,,,,,,,,Thursday,Poster 3.1,16,Carl Lemaire,\r\n1273,Exploring Object Relation in Mean Teacher for Cross-Domain Detection,,,,,,,,Thursday,Poster 3.2,49,Qi Cai,\r\n1274,ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape,,,,,,,,Tuesday,Poster 1.1,209,Fabian Manhardt,\r\n1275,Douglas-Rachford Networks: Learning Both the Image Prior and Data Fidelity Terms for Blind Image Deconvolution,,,,,,,,Thursday,Poster 3.1,162,Raied Aljadaany,\r\n1276,Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation,,,,,,,,Thursday,Poster 3.2,42,Fengda 
Zhu,\r\n1278,Orthogonal Decomposition Network for Pixel-Wise Binary Classification,,,,,,,,Wednesday,Poster 2.1,183,Chang Liu,\r\n1279,Blind Image Deblurring With Local Maximum Gradient Prior,,,,,,,,Tuesday,Poster 1.1,176,Liang Chen,\r\n1280,Emotion-Aware Human Attention Prediction,,,,,,,,Tuesday,Poster 1.2,192,Macario O. Cordel II,\r\n1281,Spatio-Temporal Video Re-Localization by Warp LSTM,,,,,,,,Tuesday,Poster 1.1,130,Yang Feng,\r\n1282,Deep Rigid Instance Scene Flow,,,,,,,,Tuesday,Poster 1.2,114,Wei-Chiu Ma,\r\n1283,Exploring Context and Visual Pattern of Relationship for Scene Graph Generation,,,,,,,,Wednesday,Poster 2.2,165,Wenbin Wang,\r\n1284,Normalized Diversification,,,,,,,,Thursday,Poster 3.1,169,Shaohui Liu,\r\n1285,SketchGAN: Joint Sketch Completion and Recognition With Generative Adversarial Network,,,,,,,,Wednesday,Poster 2.1,159,Fang Liu,\r\n1286,Adaptive Transfer Network for Cross-Domain Person Re-Identification,,,,,,,,Wednesday,Poster 2.2,46,Jiawei Liu,\r\n1287,Uncertainty Guided Multi-Scale Residual Learning-Using a Cycle Spinning CNN for Single Image De-Raining,,,,,,,,Wednesday,Poster 2.2,205,Rajeev Yasarla,\r\n1288,Hierarchy Denoising Recursive Autoencoders for 3D Scene Layout Prediction,,,,,,,,Tuesday,Poster 1.1,179,Yifei Shi,\r\n1289,Retrieval-Augmented Convolutional Neural Networks Against Adversarial Examples,,,,,,,,Thursday,Poster 3.2,60,Jake Zhao (Junbo),\r\n1290,Visual Query Answering by Entity-Attribute Graph Matching and Reasoning,,,,,,,,Wednesday,Poster 2.2,200,Peixi Xiong,\r\n1292,Monocular Depth Estimation Using Relative Depth Maps,,,,,,,,Thursday,Poster 3.1,110,Jae-Han Lee,\r\n1293,Variational Bayesian Dropout With a Hierarchical Prior,,,,,,,,Wednesday,Poster 2.2,20,Yuhang Liu,\r\n" }, { "path": "cvpr_2019_oral.csv", "content": "Paper ID,Paper Title,Author Names,Primary Subject Area,Secondary Subject Areas,Status,Day.Time.Track,Group,Topic,,,Track 1,Track 2,Track 3\n5,Finding Task-Relevant Features for Few-Shot Learning by Category 
Traversal,\"Hongyang Li (The Chinese University of Hong Kong)*; David Eigen (Clarifai Inc.); Samuel Dodge (Clarifai Inc.); Matt Zeiler (Clarifai Inc.); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",Deep Learning,,Oral,1.1.1,1,Deep Learning,,Orals 1.1,Deep Learning,3D Multiview,Action & Video\n6340,Edge-Labeling Graph Neural Network for Few-shot Learning,Jongmin Kim (KAIST)*; Taesup Kim (Université de Montréal); Sungwoong Kim (Kakao Brain); Chang D. Yoo (KAIST),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning; Segmentation, Grouping a\",Oral,1.1.1,1,,,Orals 1.2,Recognition,Synthesis,Scenes & Representation\n5728,Generating Classification Weights with Graph Neural Networks for Few-Shot Learning,\"Spyros Gidaris (valeo.ai)*; Nikos Komodakis (\"\"ENPC, France\"\")\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Oral,1.1.1,1,,,Orals 2.1,Deep Learning,3D Single View & RGBD,Motion & Biometrics\n257,Kervolutional Neural Networks,\"Chen Wang (Nanyang Technological University)*; JIANFEI YANG (Nanyang Technological University); Prof. Dr. Respected Colleauge (IJCAS Editorial Member); Junsong Yuan (\"\"State University of New York at Buffalo, USA\"\")\",Deep Learning,Computer Vision Theory,Oral,1.1.1,2,,,Orals 2.2,Recognition,Language & Reasoning,Comp. 
Photography & Graphics\n4863,Why ReLu networks yield high-confidence predictions far away from the training data and how to mitigate the problem,Matthias Hein (University of Tuebingen)*; Maksym Andriushchenko (Saarland University); Julian Bitterwolf (University of Tuebingen),Deep Learning,Statistical Learning,Oral,1.1.1,2,,,Orals 3.1,Applications,\"Learning, Physics, Theory, & Datasets\",Segmentation & Grouping\n6679,On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions,Yusuke Tsuzuku (The University of Tokyo/RIKEN)*; Issei Sato (The university of Tokyo/RIKEN),Deep Learning,Others,Oral,1.1.1,2,,,Orals 3.2,Deep Learning,Face & Body,Low-level & Optimization\n948,Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization,Siyuan Qiao (Johns Hopkins University)*; Zhe Lin (Adobe Research); Jianming Zhang (Adobe Research); Alan Yuille (Johns Hopkins University),Deep Learning,,Oral,1.1.1,3,,,,,,\n2284,Hardness-Aware Deep Metric Learning,Wenzhao Zheng (Tsinghua University); Zhaodong Chen (Tsinghua University); Jiwen Lu (Tsinghua University)*; Jie Zhou (Tsinghua University),Deep Learning,Representation Learning,Oral,1.1.1,3,,,,,,\n1183,Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation,Chenxi Liu (Johns Hopkins University)*; Liang-Chieh Chen (Google Inc.); Florian Schroff (Google Inc.); Hartwig Adam (Google); Wei Hua (Google); Alan Yuille (Johns Hopkins University); Li Fei-Fei (Stanford University),Deep Learning,\"Segmentation, Grouping and Shape\",Oral,1.1.1,3,,,,,,\n1535,Learning to Learn Loss for Active Learning,Donggeun Yoo (Lunit)*; In So Kweon (KAIST),Deep Learning,\"Others; Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Oral,1.1.1,4,,,,,,\n2230,Striking the Right Balance with Uncertainty,Salman Khan (Australian National University (ANU)); Munawar Hayat (University of Canberra); Waqas Zamir (IIAI); 
Jianbing Shen (Beijing Institute of Technology)*; Ling Shao (Inception Institute of Artificial Intelligence),Deep Learning,Low-level Vision,Oral,1.1.1,4,,,,,,\n2368,AutoAugment: Learning Augmentation Strategies from Data,Ekin D Cubuk (Google Brain)*; Barret Zoph (Google); Dandelion Mane (Protocol Labs); Vijay Vasudevan (Google Brain); Quoc Le (Google Brain),Deep Learning,,Oral,1.1.1,4,,,,,,\n494,SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration without Correspondences,Huu Minh Le (Queensland University of Technology)*; Thanh-Toan Do (The University of Liverpool); Tuan NA Hoang (Singapore University of Technology and Design); Ngai-Man Cheung (Singapore University of Technology and Design),3D from Multiview and Sensors,,Oral,1.1.2,1,3D Multiview,,,,,\n2315,BAD SLAM: Bundle Adjusted Direct RGB-D SLAM,Thomas Schöps (ETH Zurich)*; Torsten Sattler (Chalmers University of Technology); Marc Pollefeys (ETH Zurich / Microsoft),3D from Multiview and Sensors,Datasets and Evaluation; RGBD sensors and analytics,Oral,1.1.2,1,,,,,,\n2286,Revealing Scenes by Inverting Structure from Motion Reconstructions,Francesco Pittaluga (University of Florida)*; Sanjeev J Koppal (University of Florida); Sing Bing Kang (Microsoft Research); Sudipta Sinha (Microsoft Research),3D from Multiview and Sensors,Deep Learning ; Image and Video Synthesis,Oral,1.1.2,1,,,,,,\n1185,Strand-accurate Multi-view Hair Capture,Giljoo Nam (KAIST)*; Chenglei Wu (Facebook Reality Labs); Min H. 
Kim (KAIST); Yaser Sheikh (Facebook Reality Labs),3D from Multiview and Sensors,Vision + Graphics,Oral,1.1.2,2,,,,,,\n6756,DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation,Jeong Joon Park (University of Washington)*; Peter R Florence (MIT); Julian Straub (Facebook Reality Labs); Richard Newcombe (Facebook); Steven Lovegrove (Facebook),Deep Learning,3D from Multiview and Sensors; Representation Learning; Vision + Graphics,Oral,1.1.2,2,,,,,,\n2957,Pushing the Boundaries of View Extrapolation with Multiplane Images,Pratul Srinivasan (UC Berkeley)*; Richard Tucker (Google); Jonathan T Barron (Google Research); Ravi Ramamoorthi (University of California San Diego); Ren Ng (UC Berkeley); Noah Snavely (Cornell University and Google AI),3D from Multiview and Sensors,Computational Photography; Image and Video Synthesis; Vision + Graphics,Oral,1.1.2,2,,,,,,\n1935,GA-Net: Guided Aggregation Net for End-to-end Stereo Matching,Feihu Zhang (University of Oxford)*; Victor Adrian Prisacariu (University of Oxford); Yang Ruigang (Baidu); Philip Torr (University of Oxford),3D from Multiview and Sensors,Deep Learning,Oral,1.1.2,3,,,,,,\n2901,Real-time self-adaptive deep stereo,Alessio Tonioni (University of Bologna); Fabio Tosi (University of Bologna); Matteo Poggi (University of Bologna)*; Stefano Mattoccia (University of Bologna); Luigi Di Stefano (University of Bologna),3D from Multiview and Sensors,Deep Learning ; Low-level Vision,Oral,1.1.2,3,,,,,,\n6639,LAF-Net: Locally Adaptive Fusion Networks for Stereo Confidence Estimation,Sunok Kim (Yonsei University); Seungryong Kim (Yonsei University); Dongbo Min (Ewha Womans University); Kwanghoon Sohn (Yonsei Univ.)*,3D from Multiview and Sensors,Others,Oral,1.1.2,3,,,,,,\n3522,NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences,Chen Zhao (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. 
of Sci.&Tech.); chi li (Huazhong University of Science and Technology); Xin Li (West Virginia University); Jiaqi Yang (Huazhong Univ. of Sci.&Tech.)*,3D from Multiview and Sensors,Representation Learning,Oral,1.1.2,4,,,,,,\n5852,Coordinate-Free Carlsson-Weinshall Duality and Relative Multi-View Geometry,Matthew Trager (NYU)*; Martial Hebert (Carnegie Mellon University); Jean Ponce (Inria),3D from Multiview and Sensors,Computer Vision Theory,Oral,1.1.2,4,,,,,,\n1944,Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image,\"Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen))*; Zhaoxuan Zhang (Dalian University of Technology, Shenzhen Research Institute of Big Data); Dong Du (University of Science and Technology of China, Shenzhen Research Institute of Big Data); Mingdai Yang (Chinese University of Hong Kong, Shenzhen); Jingming Yu (Alibaba); Pan Pan (Alibaba Group); Xin Yang (Dalian University of Technology); Ligang Liu (University of Science and Technology of China); Zixiang Xiong (Texas A&M University); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen )\",3D from Multiview and Sensors,Deep Learning,Oral,1.1.2,4,,,,,,\n292,Video Action Transformer Network,Rohit Girdhar (Carnegie Mellon University)*; Joao Carreira (DeepMind); Carl Doersch (DeepMind); Andrew Zisserman (University of Oxford),Action Recognition,Deep Learning,Oral,1.1.3,1,Action & Video,,,,,\n302,Timeception for Complex Action Recognition,Noureldien Hussein (University of Amsterdam)*; Stratis Gavves (University of Amsterdam); Arnold W.M. 
Smeulders (University of Amsterdam),Action Recognition,Video Analytics,Oral,1.1.3,1,,,,,,\n1670,STEP: Spatio-Temporal Progressive Learning for Video Action Detection,Xitong Yang (University of Maryland)*; Xiaodong Yang (NVIDIA Research); Ming-Yu Liu (NVIDIA); Fanyi Xiao (University of California Davis); Larry Davis (University of Maryland); Jan Kautz (NVIDIA),Video Analytics,,Oral,1.1.3,1,,,,,,\n1745,Relational Action Forecasting,Chen Sun (Google)*; Abhinav Shrivastava (University of Maryland); Carl Vondrick (Columbia University); Rahul Sukthankar (Google); Kevin Murphy (Google); Cordelia Schmid (Google),Action Recognition,Video Analytics,Oral,1.1.3,2,,,,,,\n2310,Long-Term Feature Banks for Detailed Video Understanding,Chao-Yuan Wu (UT Austin)*; Christoph Feichtenhofer (Facebook AI Research); Haoqi Fan (Facebook AI Research); Kaiming He (Facebook AI Research); Philipp Kraehenbuehl (UT Austin); Ross Girshick (FAIR),Action Recognition,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding;\",Oral,1.1.3,2,,,,,,\n229,Which Way Are You Going? Imitative Decision Learning for Path Forecasting in Dynamic Scenes,Yuke Li (York University)*,Motion and Tracking,Video Analytics,Oral,1.1.3,2,,,,,,\n3106,What and How Well You Performed? 
A Multitask Approach To Action Quality Assessment,Paritosh Parmar (UNLV)*; Brendan Morris (UNLV),\"Face, Gesture, and Body Pose\",Action Recognition ; Datasets and Evaluation; Video Analytics,Oral,1.1.3,3,,,,,,\n1382,MHP-VOS: Video Object Segmentation with Multiple Hypotheses Propagation,Shuangjie Xu (Huazhong University of Science and Technology); Daizong Liu (Huazhong University of Science and Technology); Linchao Bao (Tencent AI Lab)*; Wei Liu (Tencent); Pan Zhou ( Huazhong University of Science and Technology),Video Analytics,\"Motion and Tracking; Segmentation, Grouping and Shape\",Oral,1.1.3,3,,,,,,\n1517,2.5D Visual Sound,Ruohan Gao (University of Texas at Austin)*; Kristen Grauman (Facebook AI Research & UT Austin),Video Analytics,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Oral,1.1.3,3,,,,,,\n1999,Language-driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model,\"Weining Wang (Institute of Automation, Chinese Academy of Sciences)*; Yan Huang (Institute of Automation, Chinese Academy of Sciences); Liang Wang (NLPR, China)\",Video Analytics,Vision + Language,Oral,1.1.3,4,,,,,,\n5591,Gaussian Temporal Awareness Networks for Action Localization,Fuchen Long (University of Science and Technology of China); Ting Yao (JD AI Research)*; Zhaofan Qiu (University of Science and Technology of China); Xinmei Tian (USTC); Jiebo Luo (U. Rochester); Tao Mei (AI Research of JD.com),Video Analytics,Action Recognition,Oral,1.1.3,4,,,,,,\n6940,Efficient Video Classification Using Fewer Frames,\"Shweta Bhardwaj (Indian Institute of Technology Madras, Chennai)*; Mukundhan Srinivasan (NVIDIA); Mitesh M. 
Khapra (Indian Institute of Technology Madras)\",Video Analytics,Deep Learning ; Vision Applications and Systems,Oral,1.1.3,4,,,,,,\n16,Joint Discriminative and Generative Learning for Person Re-identification,\"Zhedong Zheng (University of Technology Sydney)*; Xiaodong Yang (NVIDIA Research); Zhiding Yu (NVIDIA); Liang Zheng (Australian National University); Yi Yang (University of Technology, Sydney); Jan Kautz (NVIDIA)\",\"Recognition: Detection, Categorization, Retrieval\",\"Face, Gesture, and Body Pose ; Representation Learning\",Oral,1.2.1,1,Recognition,,,,,\n522,Unsupervised Person Re-identification by Soft Multilabel Learning,\"Hong-Xing Yu (Sun Yat-Sen University); WEI-SHI ZHENG (Sun Yat-sen University, China)*; Ancong Wu (Sun Yat-sen University); Xiaowei Guo (Tencent Youtu Lab); Shaogang Gong (Queen Mary University of London); Jian-Huang Lai (Sun Yat-sen University)\",Biometrics,\"Recognition: Detection, Categorization, Retrieval; Video Analytics\",Oral,1.2.1,1,,,,,,\n2262,Learning Context Graph for Person Search,Yichao Yan (Shanghai Jiao Tong University)*; Qiang Zhang (Shanghai Jiao Tong University); Bingbing Ni (Shanghai Jiao Tong University); Wendong Zhang (Shanghai Jiao Tong University); Minghao Xu (Shanghai Jiaotong University); Xiaokang Yang (Shanghai Jiao Tong University of China),\"Recognition: Detection, Categorization, Retrieval\",Visual Reasoning,Oral,1.2.1,1,,,,,,\n220,Gradient Matching Generative Networks for Zero-Shot Learning,Mert Bulent Sariyildiz (Bilkent University)*; Ramazan Gokberk Cinbis (METU),\"Recognition: Detection, Categorization, Retrieval\",Vision + Language,Oral,1.2.1,2,,,,,,\n4499,Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval,\"Sounak Dey (Computer Vision Center)*; Pau Riba (Computer Vision Center); Anjan Dutta (Computer Vision Center); Josep Llados (\"\"Computer Vision Center, Barcelona\"\"); Yi-Zhe Song (Queen Mary University of London)\",\"Recognition: Detection, Categorization, Retrieval\",Deep 
Learning ; Vision + Graphics ; Vision Applications and Systems,Oral,1.2.1,2,,,,,,\n5230,Zero-Shot Task Transfer,\"Arghya Pal ( Indian Institute of Technology Hyderabad)*; Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad)\",Representation Learning,Computer Vision Theory; Deep Learning ; Optimization Methods; Vision + Graphics ; Vision Application,Oral,1.2.1,2,,,,,,\n906,C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection,\"Fang Wan (University of Chinese Academy of Sciences)*; Chang Liu (University of Chinese Academy of Sciences); Wei Ke (University of Chinese Academy of Sciences); Xiangyang Ji (Tsinghua University); Jianbin Jiao (University of Chinese Academy of Sciences); Qixiang Ye (University of Chinese Academy of Sciences, China)\",\"Recognition: Detection, Categorization, Retrieval\",Optimization Methods; Others; Statistical Learning; Visual Reasoning,Oral,1.2.1,3,,,,,,\n2973,Learning Inter-pixel Relations for Weakly Supervised Instance Segmentation,Jiwoon Ahn (DGIST); Sunghyun Cho (DGIST); Suha Kwak (POSTECH)*,\"Recognition: Detection, Categorization, Retrieval\",\"Segmentation, Grouping and Shape\",Oral,1.2.1,3,,,,,,\n3916,Attention-based Dropout Layer for Weakly Supervised Object Localization,Junsuk Choe (Yonsei University); Hyunjung Shim (Yonsei University)*,\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Scene Analysis and Understanding,Oral,1.2.1,3,,,,,,\n1019,Domain Generalization by Solving Jigsaw Puzzles,Fabio M. 
Carlucci (Huawei); Antonio D'Innocente (Sapienza Università di Roma); Silvia Bucci (Italian Institute of Technology); Barbara Caputo (IIT); Tatiana Tommasi (Italian Institute of Technology)*,\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,1.2.1,4,,,,,,\n3628,Transferrable Prototypical Networks for Unsupervised Domain Adaptation,Yingwei Pan (JD AI Research)*; Ting Yao (JD AI Research); Yehao Li (Sun Yat-Sen University); Yu Wang (JD AI Research); Chong-Wah Ngo (City University of Hong Kong); Tao Mei (AI Research of JD.com),\"Recognition: Detection, Categorization, Retrieval\",,Oral,1.2.1,4,,,,,,\n1182,Adversarial Meta-Adaptation Network for Blending-target Domain Adaptation,Ziliang Chen (Sun Yat-sen University)*; Jingyu Zhuang (Sun Yat-sen University); Xiaodan Liang (Sun Yat-sen University); Liang Lin (Sun Yat-sen University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,1.2.1,4,,,,,,\n1113,ELASTIC: Improving CNNs by Instance Specific Scaling Policies,\"Huiyu Wang (Johns Hopkins University)*; Aniruddha Kembhavi (Allen Institute for Artificial Intelligence); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); Alan Yuille (Johns Hopkins University); Mohammad Rastegari (Allen Institute for Artificial Intelligence)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,1.2.1,5,,,,,,\n1782,ScratchDet: Training Single-Shot Object Detectors from Scratch,\"Rui Zhu (JD AI Research)*; Shifeng Zhang (CBSR, NLPR, CASIA); Xiaobo Wang (JD AI Research); Longyin Wen (JD Digits); Hailin Shi (JD AI Research); Liefeng Bo (JD Finance); Tao Mei (AI Research of JD.com)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,1.2.1,5,,,,,,\n3294,SFNet: Learning Object-aware Semantic Correspondence,Junghyup Lee (Yonsei University); DOHYUNG KIM (YONSEI UNIVERSITY); Jean Ponce (Inria); Bumsub Ham (Yonsei University)*,\"Recognition: Detection, Categorization, 
Retrieval\",Motion and Tracking; Scene Analysis and Understanding,Oral,1.2.1,5,,,,,,\n1294,Deep Metric Learning Beyond Binary Supervision,Sungyoun Kim (POSTECH); Minkyo Seo (POSTECH); Ivan Laptev (INRIA Paris); Minsu Cho (POSTECH); Suha Kwak (POSTECH)*,\"Recognition: Detection, Categorization, Retrieval\",,Oral,1.2.1,6,,,,,,\n1510,Learning to Cluster Faces on an Affinity Graph,Lei Yang (The Chinese University of Hong Kong)*; Xiaohang Zhan (The Chinese University of Hong Kong); Dapeng Chen (Sensetime Group Limited); Junjie Yan (Sensetime Group Limited); Chen Change Loy (Nanyang Technological University); Dahua Lin (The Chinese University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",\"Deep Learning ; Face, Gesture, and Body Pose ; Segmentation, Grouping and Shape\",Oral,1.2.1,6,,,,,,\n1610,C2AE: Class Conditioned Auto-Encoder for Open-set Recognition,Poojan B Oza (Johns Hopkins University)*; Vishal Patel (Johns Hopkins University),\"Recognition: Detection, Categorization, Retrieval\",Vision Applications and Systems,Oral,1.2.1,6,,,,,,\n1426,Shapes and Context: In-the-wild Image Synthesis & Manipulation,Aayush Bansal (Carnegie Mellon University)*; Yaser Sheikh (CMU); Deva Ramanan (Carnegie Mellon University),Image and Video Synthesis,\"Big Data, Large Scale Methods ; Segmentation, Grouping and Shape; Vision + Graphics\",Oral,1.2.2,1,Synthesis,,,,,\n462,Semantics Disentangling for Text-to-Image Generation,\"Guojun Yin (University of Science and Technology of China); Bin Liu (University of Science and Technology of China); Lu Sheng (The Chinese University of Hong Kong)*; Nenghai Yu (University of Science and Technology of China); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Jing Shao (Sensetime)\",Image and Video Synthesis,Vision + Language,Oral,1.2.2,1,,,,,,\n2072,Semantic Image Synthesis with Spatially-Adaptive Normalization,Taesung Park (UC Berkeley)*; Ming-Yu Liu (NVIDIA); Ting-Chun Wang (NVIDIA); Jun-Yan Zhu (MIT),Image and 
Video Synthesis,Computational Photography; Deep Learning ; Vision + Graphics,Oral,1.2.2,1,,,,,,\n609,Progressive Pose Attention Transfer for Person Image Generation,Zhen Zhu (Huazhong University of Science and Technology)*; Tengteng Huang (Huazhong University of Science and Technology); Baoguang Shi (Microsoft); Miao Yu (Huazhong University of Science and Technology); Bofei Wang (ZTE Corporation); Xiang Bai (Huazhong University of Science and Technology),Image and Video Synthesis,,Oral,1.2.2,2,,,,,,\n3269,Unsupervised Person Image Generation with Semantic Parsing Transformation,Sijie Song (Peking University)*; Wei Zhang (JD AI Research); Jiaying Liu (Peking University); Tao Mei (AI Research of JD.com),Image and Video Synthesis,Vision Applications and Systems,Oral,1.2.2,2,,,,,,\n2439,DeepView: View synthesis with learned gradient descent,John P Flynn (Google Inc)*; Michael Broxton (Google); Paul E Debevec (Google VR); Graham Fyffe (Google Inc.); Ryan S. Overbeck (Google Inc.); Noah Snavely (Google); Richard Tucker (Google); Matthew DuVall (Google),Image and Video Synthesis,3D from Multiview and Sensors; Computational Photography; Deep Learning ; Optimization Methods,Oral,1.2.2,2,,,,,,\n4908,Animating Arbitrary Objects via Deep Motion Transfer,Aliaksandr Siarohin (University of Trento)*; Stéphane Lathuiliere (university of Trento); Sergey Tulyakov (Snap Inc); Elisa Ricci (FBK - Technologies of Vision); Nicu Sebe (University of Trento),Image and Video Synthesis,Deep Learning,Oral,1.2.2,3,,,,,,\n5428,Textured Neural Avatars,Aliaksandra Shysheya (Samsung); Egor Zakharov (Skoltech); Renat Bashirov (Samsung); Igor Pasechnik (Samsung); Egor Burkov (Skoltech); Dmitry Ulyanov (Skoltech); Yury Malkov (Samsung); Karim Iskakov (Samsung); Kara-Ali Aliev (Samsung); Alexey Ivakhnenko (Samsung); Alexander Vakhitov (Samsung AI Research Center); Victor Lempitsky (Samsung)*,Image and Video Synthesis,Deep Learning ; Vision + Graphics ; Vision Applications and 
Systems,Oral,1.2.2,3,,,,,,\n3190,IM-Net for High Resolution Video Frame Interpolation,Tomer Peleg (Samsung Israel R&D Center)*; Pablo Szekely (Samsung Israel R&D Center); Doron Sabo (Samsung Israel R&D Center); Omry Sendik (Samsung Israel R&D Center),Image and Video Synthesis,Datasets and Evaluation; Deep Learning ; Low-level Vision; Motion and Tracking; Vision Applications,Oral,1.2.2,3,,,,,,\n1240,Homomorphic Latent Space Interpolation for Unpaired Image-to-image Translation,Yingcong Chen (Chinese University of Hong Kong)*; Xiaogang XU (The Chinese University of Hong Kong); Zhuotao Tian (Chinese University of Hong Kong); Jiaya Jia (Chinese University of Hong Kong),Image and Video Synthesis,\"Face, Gesture, and Body Pose ; Vision Applications and Systems\",Oral,1.2.2,4,,,,,,\n3069,Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation,Hao Tang (University of Trento)*; Dan Xu (University of Oxford); Yan Yan (Texas State University); Yanzhi Wang (Northeastern University); Jason J Corso (University of Michigan); Nicu Sebe (University of Trento),Image and Video Synthesis,,Oral,1.2.2,4,,,,,,\n4341,Geometry-Consistent Generative Adversarial Networks for One-Sided Unsupervised Domain Mapping,\"Huan Fu (The University of Sydney)*; Mingming Gong (University of Pittsburgh); Chaohui Wang (Laboratoire d'Informatique Gaspard Monge, Université Paris-Est); Kayhan Batmanghelich (University of Pittsburgh / Carnegie Mellon University); Kun Zhang (Carnegie Mellon University); Dacheng Tao (University of Sydney)\",Image and Video Synthesis,Deep Learning,Oral,1.2.2,4,,,,,,\n3521,DeepVoxels: Learning Persistent 3D Feature Embeddings,Vincent Sitzmann (Stanford University)*; Justus Thies (Technical University of Munich); Felix Heide (Princeton University); Matthias Niessner (Technical University of Munich); Gordon Wetzstein (Stanford University); Michael Zollhoefer (Stanford University),Image and Video Synthesis,Deep 
Learning,Oral,1.2.2,5,,,,,,\n5944,Inverse Path Tracing for Joint Material and Lighting Estimation,Dejan Azinovic (Technical University of Munich)*; Tzu-Mao Li (MIT CSAIL); Matthias Niessner (Technical University of Munich); Anton Kaplanyan (Facebook Reality Labs),Image and Video Synthesis,Computational Photography; Computer Vision Theory; Optimization Methods; Scene Analysis and Understa,Oral,1.2.2,5,,,,,,\n4057,The Visual Centrifuge: Model-Free Layered Video Representations,Jean-Baptiste Alayrac (DeepMind); Joao Carreira (DeepMind)*; Andrew Zisserman (University of Oxford),Image and Video Synthesis,Computational Photography; Deep Learning ; Representation Learning; Scene Analysis and Understanding,Oral,1.2.2,5,,,,,,\n5720,Label-Noise Robust Generative Adversarial Networks,Takuhiro Kaneko (The University of Tokyo)*; Yoshitaka Ushiku (The University of Tokyo); Tatsuya Harada (The University of Tokyo),Image and Video Synthesis,Deep Learning ; Representation Learning,Oral,1.2.2,6,,,,,,\n5766,DLOW: Domain Flow for Adaptation and Generalization,Wen Li (ETH Zurich)*; RUI GONG (ETH Zurich); Yuhua Chen (ETH Zurich); Luc Van Gool (ETH Zurich),Image and Video Synthesis,Deep Learning ; Scene Analysis and Understanding,Oral,1.2.2,6,,,,,,\n6970,CollaGAN: Collaborative GAN for Missing Image Data Imputation,\"Dongwook Lee (Korea Advanced Institute of Science and Technology)*; Junyoung Kim (Korea Advanced Institute of Science and Technology); Won-Jin Moon (Konkuk University Medical Center); Jong Chul Ye (\"\"Department of Bio and Brain Engineering, KAIST, Korea\"\")\",Image and Video Synthesis,\"Deep Learning ; Face, Gesture, and Body Pose ; Medical, Biological and Cell Microscopy\",Oral,1.2.2,6,,,,,,\n6592,d-SNE: Domain Adaptation using Stochastic Neighborhood Embedding,Xiang Xu (University of Houston); Xiong Zhou (amazon); Ragav Venkatesan (Amazon)*; Orchid Majumder (Amazon); Guru Swaminathan (Amazon),\"Recognition: Detection, Categorization, Retrieval\",Representation 
Learning,Oral,1.2.3,1,Scenes and Representation,,,,,\n197,Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation,\"Yawei Luo (University of Technology Sydney)*; Liang Zheng (Australian National University); Tao Guan (Huazhong University of Science and Technology); Junqing Yu (Huazhong University of Science & Technology); Yi Yang (University of Technology, Sydney)\",\"Segmentation, Grouping and Shape\",Deep Learning,Oral,1.2.3,1,,,,,,\n396,ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation,Tuan-Hung VU (Valeo.ai)*; Himalaya Jain (Valeo.ai); Maxime Bucher (Valeo.ai); Matthieu Cord (Sorbonne University); Patrick Pérez (Valeo.ai),Scene Analysis and Understanding,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape\",Oral,1.2.3,1,,,,,,\n325,Local Feature Augmentation with Cross-Modality Context,Zixin Luo (HKUST)*; Tianwei Shen (HKUST); Lei Zhou (HKUST); Jiahui Zhang (Tsinghua University); Yao Yao (The Hong Kong University of Science and Technology); Shiwei Li (HKUST); Tian Fang (HKUST); Long Quan (Hong Kong University of Science and Technology),Representation Learning,Low-level Vision,Oral,1.2.3,2,,,,,,\n556,Large-scale Long-Tailed Recognition in an Open World,Ziwei Liu (The Chinese University of Hong Kong)*; Zhongqi Miao (UC Berkeley); Xiaohang Zhan (The Chinese University of Hong Kong); Jiayun Wang (UC Berkeley / ICSI); Boqing Gong (Tencent AI Lab); Stella X Yu (UC Berkeley / ICSI),Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Oral,1.2.3,2,,,,,,\n5137,AET vs. 
AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data,Liheng Zhang (University of Central Florida); Guo-Jun Qi (Huawei Cloud)*; Liqiang Wang (University of Central Florida); Jiebo Luo (University of Rochester),Representation Learning,Deep Learning,Oral,1.2.3,2,,,,,,\n576,SDC - Stacked Dilated Convolution: A Unified Descriptor Network for Dense Matching Tasks,René Schuster (DFKI)*; Oliver Wasenmüller (DFKI); Christian Unger (BMW); Didier Stricker (DFKI),Representation Learning,Deep Learning ; Motion and Tracking; Robotics + Driving; Scene Analysis and Understanding; Vision Ap,Oral,1.2.3,3,,,,,,\n2746,Learning Correspondence from the Cycle-consistency of Time,Xiaolong Wang (CMU)*; Allan Jabri (UC Berkeley); Alexei A Efros (UC Berkeley),Representation Learning,Motion and Tracking; Video Analytics,Oral,1.2.3,3,,,,,,\n2131,AE^2-Nets: Autoencoder in Autoencoder Networks,Changqing Zhang (Tianjin university)*; liu yeqing (Tianjin University ); Huazhu Fu (Inception Institute of Artificial Intelligence),Representation Learning,Statistical Learning,Oral,1.2.3,3,,,,,,\n1655,Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach,Proteek Roy (Michigan State University); Vishnu Boddeti (Michigan State University)*,Representation Learning,Optimization Methods; Statistical Learning,Oral,1.2.3,4,,,,,,\n3877,Learning Spatial Common Sense with Geometry-Aware Recurrent Networks,Hsiao-Yu Tung (Carnegie Mellon University)*; Ricson Cheng (Carnegie Mellon University); Katerina Fragkiadaki (Carnegie Mellon University),Representation Learning,\"3D from Single Image; Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understa\",Oral,1.2.3,4,,,,,,\n3147,Structured Knowledge Distillation for Semantic Segmentation,\"Yifan Liu (University of Adelaide); Ke Chen (Microsoft); Chris Liu (Microsoft); Zengchang Qin (Intelligent Computing & Machine Learning Lab, School of ASEE, Beihang University); Zhenbo Luo ( Samsung 
Research Institute China-Beijing); Jingdong Wang (Microsoft Research)*\",\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Oral,1.2.3,4,,,,,,\n977,Scan2CAD: Learning CAD Model Alignment in RGB-D Scans,Armen Avetisyan (Technical University of Munich)*; Manuel Dahnert (Technical University of Munich); Angela Dai (Technical University of Munich); Manolis Savva (Simon Fraser University); Angel X Chang (Eloquent Labs); Matthias Niessner (Technical University of Munich),Scene Analysis and Understanding,\"Recognition: Detection, Categorization, Retrieval; Vision + Graphics\",Oral,1.2.3,5,,,,,,\n2799,Towards Scene Understanding: Unsupervised Monocular Depth Estimation with Semantic-aware Representation,Po-Yi Chen (National Taiwan University); Alexander H. Liu (National Taiwan University); Yen-Cheng Liu (Georgia Institute of Technology); Yu-Chiang Frank Wang (National Taiwan University)*,Scene Analysis and Understanding,\"3D from Single Image; Representation Learning; Robotics + Driving; Segmentation, Grouping and Shape\",Oral,1.2.3,5,,,,,,\n3107,Tell Me Where I Am: Object-level Scene Context Prediction,Xiaotian Qiao (City University of Hong Kong); Quanlong Zheng (City University of HongKong); Ying Cao (City University of Hong Kong)*; Rynson W.H. 
Lau (City University of Hong Kong),Scene Analysis and Understanding,Image and Video Synthesis,Oral,1.2.3,5,,,,,,\n1373,Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation,He Wang (Stanford University); Srinath Sridhar (Stanford University)*; Jingwei Huang (Stanford University); Julien Valentin (Google); Shuran Song (Princeton); Leonidas Guibas (Stanford University),Scene Analysis and Understanding,3D from Single Image; Datasets and Evaluation; Deep Learning ; RGBD sensors and analytics; Vision Ap,Oral,1.2.3,6,,,,,,\n2452,Supervised Fitting of Geometric Primitives to 3D Point Clouds,Lingxiao Li (Stanford University)*; Minhyuk Sung (Stanford University); Anastasia Dubrovina (Stanford); Li Yi (Stanford); Leonidas Guibas (Stanford University),\"Segmentation, Grouping and Shape\",Deep Learning ; Vision + Graphics,Oral,1.2.3,6,,,,,,\n4225,Do Better ImageNet Models Transfer Better?,Simon Kornblith (Google)*; Jon Shlens (Google); Quoc Le (Google Brain),Representation Learning,Deep Learning,Oral,1.2.3,6,,,,,,\n1213,Learning Video Representations from Correspondence Proposals,Xingyu Liu (Stanford University)*; Joon-Young Lee (Adobe Research); Hailin Jin (Adobe Research),Deep Learning,Action Recognition ; Representation Learning,Oral,2.1.1,1,Deep Learning,,,,,\n1503,SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks,\"Bo Li (SenseTime Group Limited)*; Wei Wu (SenseTime Group Limited); Junjie Yan (Sensetime Group Limited); Qiang Wang (University of Chinese Academy of Sciences); Fangyi Zhang (Institute of Computing Technology); Junliang Xing (Institute of Automation, Chinese Academy of Sciences)\",Deep Learning,Motion and Tracking,Oral,2.1.1,1,,,,,,\n2556,Sphere Generative Adversarial Network Based on Geometric Moment Matching,\"Sung Woo Park (Chung-Ang Univ., Korea); Junseok Kwon (Chung-Ang Univ., Korea)*\",Deep Learning,Image and Video Synthesis,Oral,2.1.1,1,,,,,,\n1431,Adversarial Attacks Beyond the Image Space,xiaohui 
zeng (toronto); Chenxi Liu (Johns Hopkins University)*; Yu-Siang Wang (National Taiwan University); Weichao Qiu (Johns Hopkins University); Lingxi Xie (Johns Hopkins University); Yu-Wing Tai (Tencent); Chi-Keung Tang (Hong Kong University of Science and Technology); Alan Yuille (Johns Hopkins University),Deep Learning,Vision + Graphics,Oral,2.1.1,2,,,,,,\n5297,Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks,Yinpeng Dong (Tsinghua University)*; Tianyu Pang (Tsinghua University); Hang Su (Tsinghua University); Jun Zhu (Tsinghua University),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Oral,2.1.1,2,,,,,,\n6129,Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses,Jérôme Rony (ÉTS Montréal)*; Luiz Gustavo Hafemann (ÉTS Montréal); Luis Eduardo Oliveira (UFPR); Ismail Ben Ayed (ETS Montreal); Robert Sabourin (Canada); Eric Granger (ETS Montreal ),Deep Learning,Others; Vision Applications and Systems,Oral,2.1.1,2,,,,,,\n1472,A General and Adaptive Robust Loss Function,Jonathan T Barron (Google Research)*,Deep Learning,3D from Single Image; Computer Vision Theory; Image and Video Synthesis; Low-level Vision; Statistic,Oral,2.1.1,3,,,,,,\n2677,Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration,\"Yang He (University of Technology Sydney)*; Ping Liu (UTS); Ziwei Wang (Information Science Academy, CETC); Zhilan Hu (Huawei); Yi Yang (University of Technology, Sydney)\",Deep Learning,Others,Oral,2.1.1,3,,,,,,\n4595,Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss,Sangil Jung (Samsung)*; Changyong Son (Samsung); Seohyung Lee (Samsung); Jinwoo Son (Samsung); Jae-Joon Han (Samsung); Youngjun Kwak (Samsung); Sung Ju Hwang (KAIST); Changkyu Choi (Samsung),Deep Learning,Optimization Methods; Vision Applications and Systems,Oral,2.1.1,3,,,,,,\n1773,Not All Areas Are Equal: Transfer Learning for Semantic 
Segmentation via Hierarchical Region Selection,Ruoqi Sun (Shanghai Jiao Tong University)*; Xinge Zhu (The Chinese University of Hong Kong); Chongruo Wu (UC Davis); Chen Huang (Carnegie Mellon University); Jianping Shi (Sensetime Group Limited); Lizhuang Ma (Shanghai Jiao Tong University),Deep Learning,\"Low-level Vision; Segmentation, Grouping and Shape\",Oral,2.1.1,4,,,,,,\n3740,Unsupervised Learning of Dense Shape Correspondence,\"Oshri Halimi (Technion)*; Or Litany (Facebook AI Research); Emanuele Rodola (Sapienza University of Rome); Alex Bronstein (Tel Aviv University, Israel); Ron Kimmel (Technion)\",Deep Learning,\"Computer Vision Theory; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping an\",Oral,2.1.1,4,,,,,,\n6188,RePr: Improved Training of Convolutional Filters,Aaditya Prakash (Brandeis University)*; James Storer (Brandeis University); Dinei Florencio (Microsoft Research); Cha Zhang (Microsoft Research),Deep Learning,,Oral,2.1.1,4,,,,,,\n2726,Balanced Self-Paced Learning for Generative Adversarial Clustering Network,Kamran Ghasedi (University of Pittsburgh)*; Xiaoqian Wang (University of Pittsburgh); Cheng Deng (Xidian University); Heng Huang (University of Pittsburgh),Deep Learning,,Oral,2.1.1,5,,,,,,\n2860,A Style-Based Generator Architecture for Generative Adversarial Networks,Tero Karras (NVIDIA Research)*; Samuli Laine (NVIDIA Research); Timo Aila (NVIDIA Research),Deep Learning,Image and Video Synthesis; Representation Learning,Oral,2.1.1,5,,,,,,\n5426,Parallel Optimal Transport GAN,Gil Avraham (Monash University)*; Yan Zuo (Monash University); Tom Drummond (Monash University),Deep Learning,Optimization Methods; Representation Learning,Oral,2.1.1,5,,,,,,\n1170,3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans,Ji Hou (Technical University of Munich); Angela Dai (Technical University of Munich); Matthias Niessner (Technical University of Munich)*,3D from Multiview and Sensors,\"Segmentation, Grouping and Shape; Vision 
Applications and Systems\",Oral,2.1.2,1,3D Single View & RGBD,,,,,\n4854,Causes and Corrections for Bimodal Multipath Scanning with Structured Light,yu zhang (Nanjing University )*; Daniel Lau (University of Kentucky); Ying Yu (University of Kentucky),3D from Multiview and Sensors,RGBD sensors and analytics,Oral,2.1.2,1,,,,,,\n7048,TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes,\"Jingwei Huang (Stanford University)*; Haotian Zhang (Stanford University); Li Yi (Stanford); Thomas Funkhouser (Princeton University and Google, Inc.); Matthias Niessner (Technical University of Munich); Leonidas Guibas (Stanford University)\",RGBD sensors and analytics,3D from Multiview and Sensors; Deep Learning ; Vision + Graphics,Oral,2.1.2,1,,,,,,\n704,PlaneRCNN: 3D Plane Detection and Reconstruction from a Single View,Chen Liu (Washington University in St. Louis)*; Kihwan Kim (NVIDIA); Jinwei Gu (NVIDIA); Yasutaka Furukawa (Simon Fraser University); Jan Kautz (NVIDIA),3D from Multiview and Sensors,3D from Single Image; Scene Analysis and Understanding,Oral,2.1.2,2,,,,,,\n3976,Occupancy Networks: Learning 3D Reconstruction in Function Space,\"Lars M Mescheder (MPI-IS and University of Tuebingen)*; Michael Oechsle (MPI-IS, University of Tuebingen and ETAS GmbH); Michael Niemeyer (MPI-IS and University of Tuebingen); Sebastian Nowozin (Google AI Berlin); Andreas Geiger (MPI-IS and University of Tuebingen)\",3D from Single Image,Deep Learning,Oral,2.1.2,2,,,,,,\n2575,3D Shape Reconstruction from Images in the Frequency Domain,\"Weichao Shen (Beijing Institute of Technology)*; Yuwei WU (Beijing Institute of Technology (BIT), China); Yunde Jia (Beijing Institute of Technology)\",3D from Single Image,,Oral,2.1.2,2,,,,,,\n1456,SiCloPe: Silhouette-based Clothed People,Ryota Natsume (Waseda University); Shunsuke Saito (University of Southern California)*; Zeng Huang (University of Southern California); Weikai Chen (USC Institute for Creative 
Technology); Chongyang Ma (Kwai Inc.); Shigeo Morishima (Waseda Research Institute for Science and Engineering); Hao Li (Pinscreen/University of Southern California/USC ICT),3D from Single Image,\"Face, Gesture, and Body Pose\",Oral,2.1.2,3,,,,,,\n3102,Detailed Human Shape Estimation from a Single Image by Hierarchical Mesh Deformation,\"Hao Zhu (Nanjing University)*; Xinxin Zuo (University of Kentucky); Sen Wang (Northwestern Polytechnical University); Xun Cao (Nanjing University); Ruigang Yang (University of Kentucky, USA)\",3D from Single Image,Deep Learning,Oral,2.1.2,3,,,,,,\n4841,Convolutional Mesh Regression for Single-Image Human Shape Reconstruction,Nikos Kolotouros (University of Pennsylvania)*; Georgios Pavlakos (University of Pennsylvania); Kostas Daniilidis (University of Pennsylvania),3D from Single Image,\"Face, Gesture, and Body Pose\",Oral,2.1.2,3,,,,,,\n2754,H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions,Bugra Tekin (Microsoft)*; Federica Bogo (Microsoft); Marc Pollefeys (ETH Zurich / Microsoft),3D from Single Image,\"Face, Gesture, and Body Pose ; Recognition: Detection, Categorization, Retrieval\",Oral,2.1.2,4,,,,,,\n3419,Learning the Depths of Moving People by Watching Frozen People,Zhengqi Li (Cornell University)*; Tali Dekel (Google); Forrester Cole (Google Research); Richard Tucker (Google); Ce Liu (Google); Bill Freeman (Google); Noah Snavely (Cornell University and Google AI),3D from Single Image,3D from Multiview and Sensors; Deep Learning,Oral,2.1.2,4,,,,,,\n3439,Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion,\"Zhenpei Yang (The University of Texas at Austin); Kristen Grauman (Facebook AI Research & UT Austin); Qixing Huang (The University of Texas at Austin)*; Linjie Luo (Snap Inc); Xiaowei Zhou (Zhejiang Univ., China); Jeffrey Pan (Austin, Texas)\",3D from Single Image,3D from Multiview and Sensors; Deep Learning ; Optimization Methods; RGBD sensors and 
analytics,Oral,2.1.2,4,,,,,,\n1943,A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Images,\"Jiapeng Tang (South China University of Technology); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen))*; Junyi Pan (South China University of Technology); Kui Jia (South China University of Technology); Xin Tong (Microsoft)\",3D from Single Image,Deep Learning,Oral,2.1.2,5,,,,,,\n3451,Structure-And-Motion-Aware Rolling Shutter Correction,Bingbing Zhuang (NUS)*; Quoc-Huy Tran (NEC Labs America); Pan Ji (NEC Labs); Loong Fah Cheong (NUS); Manmohan Chandraker (NEC Labs America),3D from Single Image,3D from Multiview and Sensors,Oral,2.1.2,5,,,,,,\n3871,PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation,\"sida peng (Zhejiang University); Yuan Liu (Zhejiang University); Qixing Huang (The University of Texas at Austin); Hujun Bao (Zhejiang University); Xiaowei Zhou (Zhejiang Univ., China)*\",3D from Single Image,Robotics + Driving; Scene Analysis and Understanding,Oral,2.1.2,5,,,,,,\n236,Learning Optical Flow with Occlusion Hallucination,Pengpeng Liu (The Chinese University of Hong Kong)*; Michael Lyu (The Chinese University of Hong Kong); Irwin King (The Chinese University of Hong Kong); Jia Xu (Tencent AI Lab),Motion and Tracking,Deep Learning,Oral,2.1.3,1,Motion & Biometrics,,,,,\n3963,Taking a Deeper Look at the Inverse Compositional Algorithm,Zhaoyang Lv (GEORGIA TECH)*; Frank Dellaert (Georgia Tech); James Rehg (Georgia Institute of Technology); Andreas Geiger (MPI-IS and University of Tuebingen),Motion and Tracking,3D from Multiview and Sensors; Optimization Methods,Oral,2.1.3,1,,,,,,\n1197,Deeper and Wider Siamese Networks for Real-Time Visual Tracking,Zhipeng Zhang (Chinese Academy of Sciences); Houwen Peng (Microsoft Research)*,Motion and Tracking,,Oral,2.1.3,1,,,,,,\n952,High Fidelity Facial Performance Tracking In-the-wild,Jae Shin Yoon (University of 
Minnesota)*; Takaaki Shiratori (Facebook Reality Labs); Shoou-I Yu (Oculus Research Pittsburgh); Hyun Soo Park (The University of Minnesota),\"Face, Gesture, and Body Pose\",3D from Single Image; Motion and Tracking,Oral,2.1.3,2,,,,,,\n2738,Diverse Generation for Multi-agent Sports Games,Raymond A Yeh (UIUC)*; Alexander Schwing (UIUC); Jonathan Huang (Google); Kevin Murphy (Google),Motion and Tracking,Deep Learning,Oral,2.1.3,2,,,,,,\n3444,Efficient Online Multi-Person 2D Pose Tracking with Recurrent Spatio-Temporal Affinity Fields,Yaadhav Raaj (CMU)*; Haroon Idrees (Carnegie Mellon University); Gines Hidalgo Martinez (Carnegie Mellon University); Yaser Sheikh (CMU),\"Face, Gesture, and Body Pose\",Action Recognition ; Motion and Tracking; Video Analytics,Oral,2.1.3,2,,,,,,\n1391,GFrames: Gradient-Based Local Reference Frame for 3D Shape Matching,\"Simone Melzi (University of Verona)*; Riccardo Spezialetti (Universita' degli studi di Bologna); Federico Tombari (Technical University of Munich, Germany); Michael Bronstein (Università della Svizzera Italiana); Luigi Di Stefano (University of Bologna); Emanuele Rodola (Sapienza University of Rome)\",Motion and Tracking,\"3D from Multiview and Sensors; Low-level Vision; Recognition: Detection, Categorization, Retrieval;\",Oral,2.1.3,3,,,,,,\n6191,Eliminating Exposure Bias and Loss-Evaluation Mismatch in Multiple Object Tracking,\"Pascal Fua (EPFL, Switzerland); Andrii Maksai (EPFL)*\",Motion and Tracking,Deep Learning,Oral,2.1.3,3,,,,,,\n3119,Graph Convolutional Tracking,\"Junyu Gao (National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences)*; Tianzhu Zhang (CAS, China); Changsheng Xu (CASIA)\",Motion and Tracking,,Oral,2.1.3,3,,,,,,\n4984,ATOM: Accurate Tracking by Overlap Maximization,Martin Danelljan (ETH Zurich)*; Goutam Bhat (ETH Zurich); Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence); Michael Felsberg (Linköping University),Motion and 
Tracking,,Oral,2.1.3,4,,,,,,\n1202,Visual Tracking via Adaptive Spatially-Regularized Correlation Filters,Kenan Dai (Dalian University of Technology); Dong Wang (Dalian University of Technology)*; Huchuan Lu (Dalian University of Technology); Chong Sun (Dalian University of Technology); Jianhua Li (Dalian University of Technology),Motion and Tracking,,Oral,2.1.3,4,,,,,,\n496,Deep Tree Learning for Zero-shot Face Anti-Spoofing,Yaojie Liu (Michigan State University)*; Joel Stehouwer (Michigan State University); Amin Jourabloo (Michigan State University); Xiaoming Liu (Michigan State University),Biometrics,\"Face, Gesture, and Body Pose\",Oral,2.1.3,4,,,,,,\n1140,ArcFace: Additive Angular Margin Loss for Deep Face Recognition,Jiankang Deng (Imperial College London)*; Jia Guo (DeepInsight); Niannan Xue (Imperial College London); Stefanos Zafeiriou (Imperial College London),Biometrics,\"Face, Gesture, and Body Pose ; Recognition: Detection, Categorization, Retrieval\",Oral,2.1.3,5,,,,,,\n1617,Learning Joint Unique-Gait and Cross-Gait Representation by Minimizing Quintuplet Loss,\"Kaihao Zhang (Australian National University)*; Wenhan Luo (Tencent AI Lab); Lin Ma (Tencent AI Lab); Wei Liu (Tencent); HONGDONG LI (Australian National University, Australia)\",Biometrics,,Oral,2.1.3,5,,,,,,\n4898,Gait Recognition via Disentangled Representation Learning,Ziyuan Zhang (Michigan State University)*; Luan Tran (Michigan State University); Xi Yin (Microsoft Cloud & AI); Yousef A Atoum (Yarmouk University); Xiaoming Liu (Michigan State University); Nanxin Wang (Ford Motor Company); Jian Wan (Ford Motor Company),Biometrics,\"Face, Gesture, and Body Pose ; Representation Learning; Vision Applications and Systems\",Oral,2.1.3,5,,,,,,\n1462,Panoptic Feature Pyramid Network,Alexander Kirillov (Facebook AI Research)*; Kaiming He (Facebook AI Research); Ross Girshick (FAIR); Piotr Dollar (FAIR),\"Recognition: Detection, Categorization, Retrieval\",\"Segmentation, Grouping and 
Shape\",Oral,2.2.1,1,Recognition,,,,,\n2705,Mask Scoring R-CNN,Zhaojin Huang (Huazhong University of Science and Technology); Lichao Huang (Horizon Robotics); Yongchao Gong (Horizon Robotics ); Chang Huang (Horizon Robotics); Xinggang Wang (Huazhong Univ. of Science and Technology)*,\"Recognition: Detection, Categorization, Retrieval\",,Oral,2.2.1,1,,,,,,\n3864,Reasoning-RCNN: Unifying Adaptive Global Reasoning into Large-scale Object Detection,Hang Xu (Huawei Noah's Ark Lab); ChenHan Jiang (Sun Yat-sen University); Xiaodan Liang (Sun Yat-sen University)*; Liang Lin (Sun Yat-sen University); Zhenguo Li (Huawei Noah's Ark Lab),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Visual Reasoning,Oral,2.2.1,1,,,,,,\n1476,Cross-Modality Personalization for Retrieval,Nils Murrugarra-Llerena (University of Pittsburgh)*; Adriana Kovashka (University of Pittsburgh),\"Recognition: Detection, Categorization, Retrieval\",Datasets and Evaluation; Vision + Language,Oral,2.2.1,2,,,,,,\n2623,Composing Text and Image for Image Retrieval - An Empirical Odyssey,\"Nam Vo (Georgia Institute of Technology)*; Lu Jiang (Google); Chen Sun (Google); Kevin Murphy (Google); Li-Jia Li (Stanford); Li Fei-Fei (Stanford University); James Hays (Georgia Institute of Technology, USA)\",\"Recognition: Detection, Categorization, Retrieval\",Datasets and Evaluation; Representation Learning; Vision + Language,Oral,2.2.1,2,,,,,,\n3524,Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation,\"Xiaobing Wang (Samsung Research Institute China-Beijing)*; yingying jiang ( Samsung Research China,Beijing); Zhenbo Luo ( Samsung Research Institute China-Beijing); Cheng-lin Liu (Institute of Automation of Chinese Academy of Sciences); Hyunsoo Choi (SAMSUNG ELECTRONICS CO.,LTD); Sungjin Kim (SAMSUNG ELECTRONICS CO.,LTD)\",\"Recognition: Detection, Categorization, Retrieval\",Vision Applications and Systems,Oral,2.2.1,2,,,,,,\n2657,Adaptive NMS: Refining Pedestrian Detection 
in a Crowd,\"Songtao Liu (BUAA); Di Huang (Beihang University, China)*; Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China)\",\"Recognition: Detection, Categorization, Retrieval\",\"Face, Gesture, and Body Pose\",Oral,2.2.1,3,,,,,,\n3517,\"Point in, Box out: Beyond Counting Persons in Crowds\",yuting liu (sichuan university)*; Miaojing Shi (Inria Rennes); Qijun Zhao (Sichuan University); Xiaofang Wang (Inria Rennes),\"Recognition: Detection, Categorization, Retrieval\",,Oral,2.2.1,3,,,,,,\n6264,Locating Objects Without Bounding Boxes,Javier Ribera (Purdue University)*; David Güera (Purdue University); Yuhao Chen (Purdue University); Edward Delp (Purdue University),\"Recognition: Detection, Categorization, Retrieval\",,Oral,2.2.1,3,,,,,,\n3333,FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery,\"Krishna Kumar Singh (University of California Davis)*; Utkarsh Ojha (University of California, Davis); Yong Jae Lee (University of California, Davis)\",\"Recognition: Detection, Categorization, Retrieval\",Image and Video Synthesis,Oral,2.2.1,4,,,,,,\n3505,Mutual Learning of Complementary Networks via Residual Correction for Improving Semi-Supervised Classification,Si Wu (South China University of Technology)*; Jichang Li (South China University of Technology); Cheng Liu (City University of Hong Kong); Zhiwen Yu (South China University of Technology); Hau San Wong (City University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,2.2.1,4,,,,,,\n4012,Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects,\"Yusuke Niitani (Preferred Networks, Inc.)*; Takuya Akiba (Preferred Networks, Inc.); Tommi Kerola (Preferred Networks, Inc.); Toru Ogawa (Preferred Networks, Inc.); Shotaro Sano (Preferred Networks, Inc.); Shuji Suzuki (Preferred Networks, Inc.)\",\"Recognition: Detection, Categorization, 
Retrieval\",Deep Learning,Oral,2.2.1,4,,,,,,\n4099,Curls & Whey: Boosting Black-Box Adversarial Attacks,Yucheng Shi (Tianjin University); Siyu Wang (Tianjin University); Yahong Han (Tianjin University)*,\"Recognition: Detection, Categorization, Retrieval\",Vision Applications and Systems,Oral,2.2.1,5,,,,,,\n5988,Barrage of Random Transforms for Adversarially Robust Defense,Edward Raff (Booz Allen Hamilton)*; Jared Sylvester (Booz Allen Hamilton); Steven Forsyth (Nvidia); Mark McLean (Laboratory for Physical Sciences),\"Recognition: Detection, Categorization, Retrieval\",\"Big Data, Large Scale Methods ; Deep Learning\",Oral,2.2.1,5,,,,,,\n4648,Aggregation Cross-Entropy for Sequence Recognition,Zecheng Xie (South China University of Technology); Yaoxiong Huang (South China University of Technology); Yuanzhi Zhu (South China University of Technology); Lianwen Jin (South China University of Technology)*; Yuliang Liu (South China University of Technology); Lele Xie (South China University of Technology),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Document Analysis ; Statistical Learning; Vision Applications and Systems,Oral,2.2.1,5,,,,,,\n4674,LaSO: Label-Set Operations networks for multi-label few-shot learning,\"Amit Alfassy (IBM-Research); Leonid Karlinsky (IBM-Research)*; Amit Aides (IBM); Joseph Shtok (IBM-Reseach); Sivan Harary (IBM-Research); Rogerio Feris (IBM Research AI, MIT-IBM Watson AI Lab); Raja Giryes (Tel Aviv University); Alex Bronstein (Technion)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,2.2.1,6,,,,,,\n5352,Few-Shot Learning with Localization in Realistic Settings,Davis Wertheimer (Cornell)*; Bharath Hariharan (Cornell University),\"Recognition: Detection, Categorization, Retrieval\",\"Deep Learning ; Segmentation, Grouping and Shape\",Oral,2.2.1,6,,,,,,\n5575,AdaGraph: Unifying Predictive and Continuous Domain Adaptation through Graphs,Massimiliano Mancini (Sapienza University of Rome)*; 
Samuel Rota Bulò (Mapillary Research); Barbara Caputo (IIT); Elisa Ricci (FBK - Technologies of Vision),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,2.2.1,6,,,,,,\n12,Grounded Video Description,Luowei Zhou (University of Michigan)*; Yannis Kalantidis (Facebook Research); Xinlei Chen (Facebook AI Research); Jason J Corso (University of Michigan); Marcus Rohrbach (Facebook AI Research),Vision + Language,,Oral,2.2.2,1,Language & Reasoning,,,,,\n3566,Streamlined Dense Video Captioning,Jonghwan Mun (POSTECH)*; Linjie Yang (ByteDance AI Lab); Zhou Ren (Snap Inc.); Ning Xu (Snap); Bohyung Han (Seoul National University),Vision + Language,Deep Learning ; Video Analytics,Oral,2.2.2,1,,,,,,\n5612,Adversarial Inference for Multi-Sentence Video Description,Jae Sung Park (UC Berkeley); Marcus Rohrbach (Facebook AI Research); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley)*,Vision + Language,,Oral,2.2.2,1,,,,,,\n4705,Unified Visual-Semantic Embeddings: Bridging Vision and Language with Structured Meaning Representations,\"Hao Wu (Fudan University)*; Jiayuan Mao (Tsinghua University); Yufeng Zhang (Fudan University); Weiwei Sun (\"\" Fudan University, China\"\"); Yuning Jiang (Bytedance); Lei Li (ByteDance AI Lab); Weiying Ma (Bytedance)\",Vision + Language,Representation Learning,Oral,2.2.2,2,,,,,,\n3640,Learning to Compose Dynamic Tree Structures for Visual Contexts,Kaihua Tang (Nanyang Technological University)*; Hanwang Zhang (Nanyang Technological University); Baoyuan Wu (Tencent AI Lab); Wenhan Luo (Tencent AI Lab); Wei Liu (Tencent),Vision + Language,Scene Analysis and Understanding; Visual Reasoning,Oral,2.2.2,2,,,,,,\n5104,Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation,\"Xin Wang (University of California, Santa Barbara)*; Qiuyuan Huang (Microsoft Research AI); Asli Celikyilmaz (Microsoft Research AI); Jianfeng Gao (Microsoft Research); Dinghan Shen (Duke University); Yuan-Fang 
Wang (UC Santa Barbara); William Yang Wang (UC Santa Barbara); Lei Zhang (Microsoft Research)\",Vision + Language,Robotics + Driving; Vision Applications and Systems; Visual Reasoning,Oral,2.2.2,2,,,,,,\n1824,Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering,\"gao peng (Chinese university of hong kong)*; Hongsheng Li (Chinese University of Hong Kong); Haoxuan You (Tsinghua University); Zhengkai Jiang (Institute of Automation，Chinese Academy of Sciences); Pan Lu (Tsinghua University); Steven Hoi (SMU); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",Vision + Language,Vision Applications and Systems; Visual Reasoning,Oral,2.2.2,3,,,,,,\n3454,Cycle-Consistency for Robust Visual Question Answering,Meet Shah (Facebook AI Research)*; Xinlei Chen (Facebook AI Research); Marcus Rohrbach (Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research),Vision + Language,,Oral,2.2.2,3,,,,,,\n135,Embodied Question Answering in Photorealistic Environments with Point Cloud Perception,Erik Wijmans (Georgia Tech)*; Samyak Datta (Georgia Tech); Oleksandr Maksymets (Facebook AI Research); Abhishek Das (Georgia Tech); Georgia Gkioxari (Facebook); Stefan Lee (Georgia Institute of Technology); Irfan Essa (Georgia Institute of Technology); Dhruv Batra (Georgia Tech & Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research),Vision + Language,,Oral,2.2.2,3,,,,,,\n3909,Reasoning Visual Dialogs with Structural and Partial Observations,Zilong Zheng (UCLA); Wenguan Wang (Inception Institute of Artificial Intelligence)*; Siyuan Qi (UCLA); Song-Chun Zhu (UCLA),Vision + Language,,Oral,2.2.2,4,,,,,,\n3129,Recursive Visual Attention in Visual Dialog,Yulei Niu (Renmin University of China); Manli Zhang (Renmin University of China); Jianhong Zhang (Renmin University of China); Zhiwu Lu (Renmin University of China)*; Ji-Rong Wen (Renmin University of China); Hanwang Zhang (Nanyang Technological University),Vision + 
Language,Visual Reasoning,Oral,2.2.2,4,,,,,,\n3820,Two Body Problem: Collaborative Visual Task Completion,\"Unnat Jain (UIUC)*; Luca Weihs (Allen Institute for Artificial Intelligence); Eric Kolve (Allen AI); Mohammad Rastegari (Allen Institute for Artificial Intelligence); Svetlana Lazebnik (UIUC); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); Alexander Schwing (UIUC); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence)\",Visual Reasoning,\"Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding\",Oral,2.2.2,4,,,,,,\n7021,GQA: a new dataset for compositional question answering over real-world images,Drew A Hudson (Stanford University)*; Chris Manning (Stanford),Visual Reasoning,Datasets and Evaluation; Deep Learning ; Scene Analysis and Understanding; Vision + Language,Oral,2.2.2,5,,,,,,\n1530,Text2Scene: Generating Compositional Scenes from Textual Descriptions,Fuwen Tan (University of Virginia)*; Song Feng (IBM Research); Vicente Ordonez (University of Virginia),Vision + Language,Image and Video Synthesis,Oral,2.2.2,5,,,,,,\n5126,From Recognition to Cognition: Visual Commonsense Reasoning,\"Rowan Zellers (University of Washington)*; Yonatan Bisk (University of Washington); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); Yejin Choi (University of Washington)\",Vision + Language,\"Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding; Visual Reasonin\",Oral,2.2.2,5,,,,,,\n3587,The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation,Chih-Yao Ma (Georgia Institute of Technology)*; Zuxuan Wu (UMD); Ghassan AlRegib (Georgia Institute of Technology); Caiming Xiong (Salesforce Research); Zsolt Kira (Georgia Institute of Technology),Vision + Language,Visual Reasoning,Oral,2.2.2,6,,,,,,\n6287,Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation,Liyiming Ke (University of 
Washington); Xiujun Li (Microsoft Research)*; Yonatan Bisk (University of Washington); Ari Holtzman (University of Washington); Zhe Gan (Microsoft); Jingjing Liu (Microsoft); Jianfeng Gao (Microsoft Research); Yejin Choi (University of Washington); Siddhartha Srinivasa (University of Washington),Vision + Language,Robotics + Driving,Oral,2.2.2,6,,,,,,\n1770,Learning to Learn How to Learn: Self-Adaptive Visual Navigation using Meta-Learning,\"Mitchell N Wortsman (Allen Institute for Artificial Intelligence); Kiana Ehsani (University of Washington); Mohammad Rastegari (Allen Institute for Artificial Intelligence); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); Roozbeh Mottaghi (Allen Institute for AI)*\",Visual Reasoning,Scene Analysis and Understanding; Statistical Learning,Oral,2.2.2,6,,,,,,\n1064,From One Photon to a Billion: High Flux Imaging with Single-Photon Sensors,\"Atul N Ingle (University of Wisconsin-Madison)*; Andreas Velten (University of Wisconsin - Madison); Mohit Gupta (\"\"University of Wisconsin-Madison, USA \"\")\",Computational Photography,Physics-based Vision and Shape-from-X,Oral,2.2.3,1,Comp. 
Photography & Graphics,,,,,\n1624,Photon-Flooded Single-Photon 3D Cameras,\"Anant Gupta (University of Wisconsin Madison)*; Atul N Ingle (University of Wisconsin-Madison); Andreas Velten (University of Wisconsin - Madison); Mohit Gupta (\"\"University of Wisconsin-Madison, USA \"\")\",Computational Photography,Physics-based Vision and Shape-from-X,Oral,2.2.3,1,,,,,,\n2059,Acoustic Non-Line-of-Sight Imaging,David Lindell (Stanford University)*; Gordon Wetzstein (Stanford University); Vladlen Koltun (Intel Labs),Computational Photography,,Oral,2.2.3,1,,,,,,\n3310,Steady-state Non-Line-of-Sight Imaging,Wenzheng Chen (University of Toronto); Simon Daneau (Algolux)*; Colin Brosseau (Algolux); Felix Heide (Princeton University),Computational Photography,Low-level Vision; Physics-based Vision and Shape-from-X,Oral,2.2.3,2,,,,,,\n2427,A Theory of Fermat Paths for Non-Line-of-Sight Shape Reconstruction,Shumian Xin (Carnegie Mellon University); Sotiris Nousias (University College London); Kyros Kutulakos (University of Toronto); Aswin Sankaranarayanan (Carnegie Mellon University); Srinivasa G Narasimhan (Carnegie Mellon University); Ioannis Gkioulekas (Carnegie Mellon University)*,Computational Photography,Physics-based Vision and Shape-from-X,Oral,2.2.3,2,,,,,,\n474,End-to-end Projector Photometric Compensation,Bingyao Huang (Temple University)*; Haibin Ling (Temple University),Vision + Graphics,Computational Photography; Deep Learning ; Others,Oral,2.2.3,2,,,,,,\n2605,Bringing a Blurry Frame Alive at High Frame-Rate with an Event Camera,\"Liyuan Pan (The Australian National University)*; cedric scheerlinck (The Australian National University); RICHARD HARTLEY (Australian National University, Australia); Miaomiao Liu (The Australian National University); Yuchao Dai (Northwestern Polytechnical University); Xin Yu (Australian National University)\",Computational Photography,Image and Video Synthesis; Vision Applications and Systems,Oral,2.2.3,3,,,,,,\n5932,Bringing Alive 
Blurred Moments!,\"Kuldeep Purohit (Indian Institute of Technology Madras)*; Anshul Shah (University of Maryland, College Park); Rajagopalan N Ambasamudram (Indian Institute of Technology Madras)\",Computational Photography,Low-level Vision,Oral,2.2.3,3,,,,,,\n1607,Learning to Synthesize Motion Blur,Tim Brooks (Google)*; Jonathan T Barron (Google Research),Computational Photography,Deep Learning ; Image and Video Synthesis; Motion and Tracking; Vision + Graphics,Oral,2.2.3,3,,,,,,\n2861,Underexposed Photo Enhancement using Deep Illumination Estimation,\"Ruixing Wang (The Chinese University of Hong Kong); Qing Zhang ( Sun Yat-sen University); Chi-Wing Fu (The Chinese University of Hong Kong); Xiaoyong Shen (Tencent); WEI-SHI ZHENG (Sun Yat-sen University, China)*; Jiaya Jia (Chinese University of Hong Kong)\",Computational Photography,Deep Learning ; Low-level Vision,Oral,2.2.3,4,,,,,,\n2843,Blind Visual Motif Removal from a Single Image,Amir Hertz (Tel Aviv University)*; Sharon Fogel (Tel-Aviv university); Rana Hanocka (TAU); Raja Giryes (Tel Aviv University); Danny Cohen-Or (Tel Aviv University),Vision + Graphics,Deep Learning,Oral,2.2.3,4,,,,,,\n6541,Non-local Meets Global: An Integrated Paradigm for Hyperspectral Denoising,Wei He (RIKEN AIP)*; Quanming Yao (4Paradigm); Chao Li (RIKEN); Naoto Yokoya (RIKEN Center for Advanced Intelligence Project (AIP)); Qibin Zhao (RIKEN),Low-level Vision,Vision + Graphics,Oral,2.2.3,4,,,,,,\n4943,Total Scene Capture: Neural Rerendering in the Wild,\"Moustafa Meshry (University of Maryland)*; Ricardo Martin-Brualla (Google); Noah Snavely (Cornell University and Google AI); Hugues Hoppe (Google Inc.); Sameh Khamis (Google); Rohit Pandey (Google); Dan B Goldman (Google, Inc.)\",Vision + Graphics,3D from Multiview and Sensors; Image and Video Synthesis,Oral,2.2.3,5,,,,,,\n430,GeoNet: Deep Geodesic Networks for Point Cloud Analysis,Tong He (UCLA)*; Haibin Huang (Face++ (Megvii)); Li Yi (Stanford); Yuqian Zhou (UIUC); QIHAO WU 
(Face++ (Megvii)); jue wang (Face++ (Megvii)); Stefano Soatto (UCLA),Vision + Graphics,\"Deep Learning ; Representation Learning; Segmentation, Grouping and Shape\",Oral,2.2.3,5,,,,,,\n2440,MeshAdv: Adversarial Meshes for Visual Recognition,\"CHAOWEI XIAO (University of Michigan, Ann Arbor); Dawei Yang (University of Michigan, Ann Arbor)*; Bo Li (University of Illinois at Urbana–Champaign); Jia Deng (Princeton University); mingyan liu (university of Michigan, Ann Arbor)\",Vision + Graphics,Deep Learning,Oral,2.2.3,5,,,,,,\n4701,Fast Spatially-Varying Indoor Lighting Estimation,Mathieu Garon (Université Laval); Kalyan Sunkavalli (Adobe Research); Nathan Carr (Adobe); Sunil Hadap (Adobe); Jean-Francois Lalonde (Université Laval)*,Vision + Graphics,Computational Photography; Deep Learning,Oral,2.2.3,6,,,,,,\n1188,Neural Illumination: Lighting Prediction for Indoor Environments,\"Shuran Song (Princeton)*; Thomas Funkhouser (Princeton University and Google, Inc.)\",Vision + Graphics,3D from Single Image; Scene Analysis and Understanding,Oral,2.2.3,6,,,,,,\n4363,Deep Sky Modeling for Single Image Outdoor Lighting Estimation,Yannick Hold-Geoffroy (Adobe Research)*; Akshaya Athwale (Indian Institute of Technology Dhanbad); Jean-Francois Lalonde (Université Laval),Vision + Graphics,Computational Photography; Deep Learning ; Scene Analysis and Understanding,Oral,2.2.3,6,,,,,,\n2180,Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning from Radiology Reports and Label Ontology,\"Ke Yan (National Institutes of Health)*; Yifan Peng (NIH); Veit Sanfort (NIH); Mohammadhadi Bagheri (National Institutes of Health); Zhiyong Lu (NLM/NCBI/NIH); Ronald Summers (National Institutes of Health, Bethesda, Maryland, United States)\",\"Medical, Biological and Cell Microscopy\",\"Datasets and Evaluation; Deep Learning ; Recognition: Detection, Categorization, Retrieval; Represen\",Oral,3.1.1,1,Applications,,,,,\n4246,Robust Histopathology 
Image Analysis: to Label or to Synthesize?,Le Hou (Stony Brook University)*; Ayush Agarwal (Stanford University); Dimitris Samaras (Stony Brook University); Tahsin Kurc (Stony Brook University); Rajarsi Gupta (Stony Brook University); Joel Saltz (Stony Brook University),\"Medical, Biological and Cell Microscopy\",\"Segmentation, Grouping and Shape; Vision Applications and Systems\",Oral,3.1.1,1,,,,,,\n6477,Data augmentation with spatial and appearance transforms for one-shot medical image segmentation,Amy Zhao (MIT)*; Guha Balakrishnan (MIT); Fredo Durand (MIT); John Guttag (MIT); Adrian V Dalca (MIT),\"Medical, Biological and Cell Microscopy\",Image and Video Synthesis,Oral,3.1.1,1,,,,,,\n1853,Shifting More Attention to Video Salient Object Detection,Deng-Ping Fan (Nankai University); Wenguan Wang (Inception Institute of Artificial Intelligence); Ming-Ming Cheng (Nankai University)*; Jianbing Shen (Beijing Institute of Technology),Vision Applications and Systems,Low-level Vision,Oral,3.1.1,2,,,,,,\n864,Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration,De-An Huang (Stanford University)*; Suraj Nair (Stanford University); Danfei Xu (Stanford University); Yuke Zhu (Stanford University); Animesh Garg (Stanford University); Li Fei-Fei (Stanford University); Silvio Savarese (Stanford University); Juan Carlos Niebles (Stanford University),Robotics + Driving,Visual Reasoning,Oral,3.1.1,2,,,,,,\n1296,Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry,\"Fei Xue (Peking University)*; Xin Wang (Peking University); Shunkai Li (Peking University); Qiuyuan Wang (Peking University); Junqiu Wang (Beijing Changcheng Aviation Measurement and Control Institute); Hongbin Zha (Peking University, China)\",Robotics + Driving,Deep Learning ; Motion and Tracking,Oral,3.1.1,2,,,,,,\n3139,Image Generation from Layout,Bo Zhao (University of British Columbia)*; Lili Meng (University of British Columbia); Weidong Yin (University of 
British Columbia); Leonid Sigal (University of British Columbia),Vision Applications and Systems,Vision + Graphics,Oral,3.1.1,3,,,,,,\n4603,Multimodal Explanations by Predicting Counterfactuality in Videos,Atsushi Kanehira (The University of Tokyo)*; Kentaro Takemoto (University of Tokyo); Sho Inayoshi (The University of Tokyo); Tatsuya Harada (The University of Tokyo),Vision Applications and Systems,Video Analytics; Vision + Language,Oral,3.1.1,3,,,,,,\n4606,Learning to Explain with Complemental Examples,Atsushi Kanehira (The University of Tokyo)*; Tatsuya Harada (The University of Tokyo),Vision Applications and Systems,Vision + Language,Oral,3.1.1,3,,,,,,\n3441,HAQ: Hardware-Aware Automated Quantization,Kuan Wang (MIT); Zhijian Liu (MIT); Yujun Lin (MIT); Ji Lin (MIT); Song Han (MIT)*,Vision Applications and Systems,Deep Learning,Oral,3.1.1,4,,,,,,\n4965,Content Authentication for Neural Imaging Pipelines: End-to-end Optimization of Photo Provenance in Complex Distribution Channels,Pawel Korus (New York University)*; Nasir Memon (New York University),Vision Applications and Systems,Low-level Vision,Oral,3.1.1,4,,,,,,\n5712,Inverse Procedural Modeling of Knitwear,Elena Trunz (University of Bonn)*; Sebastian Merzbach (University of Bonn); Jonathan Klein (University of Bonn); Thomas Schulze (University of Bonn); Michael Weinmann (University of Bonn); Reinhard Klein (University of Bonn),Vision Applications and Systems,,Oral,3.1.1,4,,,,,,\n2857,Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video,Zongmian Li (INRIA Paris)*; Jiri Sedlar (CVUT); Justin Carpentier (INRIA); Ivan Laptev (INRIA Paris); Nicolas Mansard (LAAS-CNRS); Josef Sivic (INRIA),Robotics + Driving,Action Recognition ; Video Analytics,Oral,3.1.1,5,,,,,,\n4235,DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds,Li Ding (University of Rochester); Chen Feng (New York University)*,Robotics + Driving,3D from Multiview and Sensors; Deep Learning ; RGBD sensors and 
analytics,Oral,3.1.1,5,,,,,,\n4880,End-to-end Interpretable Neural Motion Planner,\"Wenyuan Zeng (Uber ATG, University of Toronto); Wenjie Luo (Uber ATG / University of Toronto)*; Shun Da Suo (Uber ATG, University of Toronto); Abbas Sadat (Uber ATG); Bin Yang (Uber ATG & University of Toronto); Sergio Casas Romero (Uber ATG, University of Toronto); Raquel Urtasun (Uber ATG)\",Robotics + Driving,,Oral,3.1.1,5,,,,,,\n2609,\"Divergence Triangle for Joint Training of Generator Model, Energy-based Model, and Inferential Model\",\"Tian Han (University of California, Los Angeles)*; Erik Nijkamp (UCLA); Xiaolin Fang (Zhejiang University); Mitchell K Hill (UCLA Department of Statistics); Song-Chun Zhu (UCLA); Ying Nian Wu (University of California, Los Angeles)\",Statistical Learning,Deep Learning ; Image and Video Synthesis,Oral,3.1.2,1,\"Learning, Physics, Theory, & Datasets\",,,,,\n2829,IMAGE DEFORMATION META-NETWORK FOR ONE-SHOT LEARNING,Zitian Chen (Fudan University); Yanwei Fu (Fudan University)*; Yu-Xiong Wang (Carnegie Mellon University); Lin Ma (Tencent AI Lab); Wei Liu (Tencent); Martial Hebert (Carnegie Mellon University),Statistical Learning,Representation Learning,Oral,3.1.2,1,,,,,,\n4917,Online high-rank matrix completion,Jicong Fan (Cornell University)*; Madeleine Udell (Cornell University),Statistical Learning,\"Big Data, Large Scale Methods ; Motion and Tracking\",Oral,3.1.2,1,,,,,,\n1274,Multispectral Imaging for Fine-Grained Recognition of Powders on Complex Backgrounds,Tiancheng Zhi (Carnegie Mellon University)*; Bernardo Pires (-); Martial Hebert (Carnegie Mellon School of Computer Science); Srinivasa G Narasimhan (Carnegie Mellon University),Physics-based Vision and Shape-from-X,Computational Photography,Oral,3.1.2,2,,,,,,\n2138,ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging,\"Samarth Brahmbhatt (Georgia Institute of Technology)*; Cusuh Ham (Georgia Institute of Technology); Charlie Kemp (Georgia Institute of Technology); James 
Hays (Georgia Institute of Technology, USA)\",Datasets and Evaluation,\"Others; Robotics + Driving; Segmentation, Grouping and Shape\",Oral,3.1.2,2,,,,,,\n4535,Robust Subspace Clustering with Independent and Piecewise Identically Distributed Noise Modeling,Yuanman Li (University of Macau); Jiantao Zhou (University of Macau)*; Xianwei Zheng (Foshan University); Jinyu Tian (University of Macau); Yuan Yan Tang (University of Macau),Computer Vision Theory,\"Representation Learning; Segmentation, Grouping and Shape; Vision Applications and Systems\",Oral,3.1.2,2,,,,,,\n4185,What Correspondences Reveal about Unknown Camera and Motion Models?,Thomas Probst (ETH Zurich)*; Ajad Chhatkuli (ETH Zurich); Danda Pani Paudel (ETH Zürich); Luc Van Gool (ETH Zurich),Computer Vision Theory,3D from Multiview and Sensors,Oral,3.1.2,3,,,,,,\n1504,Self-calibrating Deep Photometric Stereo Networks,Guanying Chen (The University of Hong Kong)*; Kai Han (University of Oxford); Boxin Shi (Peking University); Yasuyuki Matsushita (Osaka University); Kwan-Yee K. 
Wong (The University of Hong Kong),Physics-based Vision and Shape-from-X,Computational Photography,Oral,3.1.2,3,,,,,,\n4994,Know Before You Go: 3D Tracking and Forecasting with Rich Maps,\"John W Lambert (Georgia Institute of Technology)*; James Hays (Georgia Institute of Technology, USA); Jagjeet Singh (CMU); MingFang Chang (Carnegie Mellon University); Simon Lucey (CMU); Deva Ramanan (Carnegie Mellon University); Patsorn Sangkloy (Georgia Institute of Technology); De Wang (Argo AI); Pete Carr (Argo AI)\",Datasets and Evaluation,3D from Multiview and Sensors; Robotics + Driving,Oral,3.1.2,3,,,,,,\n5176,Side Window Filtering,Hui Yin (Shenzhen University); Yuanhao Gong (Shenzhen University); Guoping Qiu (Shenzhen University)*,Computer Vision Theory,Low-level Vision; Others,Oral,3.1.2,4,,,,,,\n2319,Defense Against Adversarial Images using Web-Scale Nearest-Neighbor Search,Abhimanyu Dubey (Massachusetts Institute of Technology)*; Laurens van der Maaten (Facebook); Zeki Yalniz (Facebook); Yixuan Li (Facebook Research); Dhruv Mahajan (Facebook),\"Big Data, Large Scale Methods\",Datasets and Evaluation; Deep Learning ; Representation Learning; Visual Reasoning,Oral,3.1.2,4,,,,,,\n1519,Incremental Object Learning from Contiguous Views,Stefan Stojanov (Georgia Institute of Technology)*; Samarth Mishra (Georgia Institute of Technology); Ngoc Anh Thai (Georgia Institute of Technology); Nikhil Dhanda (Georgia Institute of Technology); Ahmad Humayun (Georgia Institute of Technology); Linda Smith (Indiana University); Chen Yu (Indiana University); James Rehg (Georgia Institute of Technology),Datasets and Evaluation,\"Deep Learning ; Image and Video Synthesis; Recognition: Detection, Categorization, Retrieval\",Oral,3.1.2,4,,,,,,\n3627,IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition,Xiaoping Wu (Nankai University); Chi Zhan (Nankai University); Yukun Lai (Cardiff University); Ming-Ming Cheng (Nankai University); Jufeng Yang (Nankai University )*,Datasets and 
Evaluation,Vision Applications and Systems,Oral,3.1.2,5,,,,,,\n6334,CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification,ZHENG TANG (University of Washington)*; Milind Naphade (NVidia); Ming-Yu Liu (NVIDIA); Xiaodong Yang (NVIDIA Research); Stan Birchfield (NVIDIA); Shuo Wang (NVidia); Ratnesh Kumar (NVIDIA); David Anastasiu (SJSU); Jenq-Neng Hwang (University of WA),Datasets and Evaluation,\"Big Data, Large Scale Methods ; Motion and Tracking\",Oral,3.1.2,5,,,,,,\n6439,Social-IQ: A Question Answering Benchmark for Open-ended Social Intelligence,Amir Zadeh (CMU LTI)*; Michael K Chan (CMU); Paul Pu Liang (Carnegie Mellon University); Edmund Tong (CMU); Louis-Philippe Morency (Carnegie Mellon University),Datasets and Evaluation,Vision + Language ; Visual Reasoning,Oral,3.1.2,5,,,,,,\n3471,UPSNet: A Unified Panoptic Segmentation Network,Yuwen Xiong (Uber ATG; University of Toronto)*; Renjie Liao (University of Toronto); Hengshuang Zhao (The Chinese University of Hong Kong); Rui Hu (Uber); Min Bai (University of Toronto); Ersin Yumer (Uber ATG); Raquel Urtasun (Uber ATG),Scene Analysis and Understanding,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape;\",Oral,3.1.3,1,Grouping,,,,,\n5828,Joint Semantic-Instance Segmentation of 3D Point Clouds Using Multi-Set Label Conditional Random Fields,\"Quang-Hieu Pham (Singapore University of Technology and Design)*; Binh-Son Hua (The University of Tokyo); Thanh Nguyen (Deakin University, Australia); Gemma Roig (MIT); Sai-Kit Yeung (Hong Kong University of Science and Technology)\",Scene Analysis and Understanding,\"Deep Learning ; Segmentation, Grouping and Shape\",Oral,3.1.3,1,,,,,,\n1303,Proposal-free instance segmentation with a clustering loss function,Davy Neven (KULeuven)*; Bert De Brabandere (KU Leuven); Marc Proesmans (KU Leuven); Luc Van Gool (ETH Zurich),\"Segmentation, Grouping and Shape\",\"Recognition: Detection, 
Categorization, Retrieval\",Oral,3.1.3,1,,,,,,\n153,Deep Instance Co-segmentation by Co-peak Search and Co-saliency Detection,Kuang-Jui Hsu (Academia Sinica)*; Yen-Yu Lin (Academia Sinica); Yung-Yu Chuang (National Taiwan University),\"Segmentation, Grouping and Shape\",Datasets and Evaluation; Deep Learning ; Low-level Vision,Oral,3.1.3,2,,,,,,\n4250,Improving Semantic Segmentation via Video Propagation and Label Relaxation,Yi Zhu (UC Merced); Karan Sapra (NVIDIA)*; Fitsum Reda (NVIDIA); Kevin Shih (NVIDIA); Shawn Newsam (UC Merced); Andrew Tao (NVIDIA); Bryan Catanzaro (NVIDIA),\"Segmentation, Grouping and Shape\",Deep Learning ; Image and Video Synthesis,Oral,3.1.3,2,,,,,,\n3121,Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video,Samvit Jain (UC Berkeley)*; Xin Wang (UC Berkeley); Joey Gonzalez (Berkeley),\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Oral,3.1.3,2,,,,,,\n4311,Shape2Motion: Joint Analysis of Motion Parts and Attributes from 3D Shapes,\"Xiaogang Wang (State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering,Beihang University); Kai Xu (National University of Defense Technology)*; Yahao Shi (State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering,Beihang University); Bin Zhou (State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering,Beihang University); Xiaowu Chen (); Qinping Zhao (State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering,Beihang University)\",\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding; Vision + Graphics,Oral,3.1.3,3,,,,,,\n256,Semantic Correlation Promoted Shape-Variant Context for Segmentation,Henghui Ding (Nanyang Technological University)*; Xudong Jiang (Nanyang Technological University); Bing Shuai (Amazon); Ai Qun Liu (Nanyang Technological 
University); Gang Wang (Alibaba Group),\"Segmentation, Grouping and Shape\",Deep Learning ; Scene Analysis and Understanding,Oral,3.1.3,3,,,,,,\n2493,Relation-Shape Convolutional Neural Network for Point Cloud Analysis,\"Yongcheng Liu (Institute of Automation, Chinese Academy of Sciences)*; Bin Fan (Institute of Automation, Chinese Academy of Sciences, China); SHIMING XIANG (Chinese Academy of Sciences, China); Chunhong Pan (Institute of Automation, Chinese Academy of Sciences)\",\"Segmentation, Grouping and Shape\",Deep Learning ; Representation Learning,Oral,3.1.3,3,,,,,,\n1233,Enhancing Diversity of Defocus Blur Detectors via Cross-Ensemble Network,Wenda Zhao (Dalian University of Technology)*; Bowen Zheng (Dalian University of Technology); Qiuhua Lin (Dalian University of Technology); Huchuan Lu (Dalian University of Technology),\"Segmentation, Grouping and Shape\",Deep Learning,Oral,3.1.3,4,,,,,,\n2793,BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames,Brent Griffin (University of Michigan)*; Jason J Corso (University of Michigan),\"Segmentation, Grouping and Shape\",Datasets and Evaluation; Deep Learning ; Motion and Tracking,Oral,3.1.3,4,,,,,,\n3074,Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-high Resolution Images,Wuyang Chen (Texas A&M University)*; Ziyu Jiang (Texas A&M University); Zhangyang Wang (TAMU); Kexin Cui (Texas A&M University); Xiaoning Qian (Texas A&M University),\"Segmentation, Grouping and Shape\",Deep Learning,Oral,3.1.3,4,,,,,,\n3203,Efficient Parameter-free Clustering Using First Neighbor Relations,Saquib Sarfraz (Karlsruhe Institute of Technology)*; Vivek Sharma (Karlsruhe Institute of Technology); Rainer Stiefelhagen (Karlsruhe Institute of Technology),\"Segmentation, Grouping and Shape\",\"Big Data, Large Scale Methods ; Others; Representation Learning; Vision Applications and Systems\",Oral,3.1.3,5,,,,,,\n3859,Learning Personalized Modular Network 
Guided by Structured Knowledge,Xiaodan Liang (Sun Yat-sen University)*,\"Segmentation, Grouping and Shape\",Deep Learning,Oral,3.1.3,5,,,,,,\n5616,A Generative Appearance Model for End-to-end Video Object Segmentation,Joakim Johnander (Linköping University)*; Martin Danelljan (ETH Zurich); Emil Brissman (Linköping University); Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence); Michael Felsberg (Linköping University),\"Segmentation, Grouping and Shape\",Motion and Tracking,Oral,3.1.3,5,,,,,,\n3041,Practical Full Resolution Learned Lossless Image Compression,Fabian Mentzer (ETH Zurich)*; Eirikur Agustsson (ETH Zurich); Michael Tschannen (ETH Zurich); Radu Timofte (ETH Zurich); Luc Van Gool (ETH Zurich),Deep Learning,Representation Learning,Oral,3.2.1,1,Deep Learning,,,,,\n6930,Image-to-Image Translation via Group-wise Deep Whitening and Coloring,Wonwoong Cho (Korea University); Sungha Choi (Korea University); David Park (Korea University); Inkyu Shin (Hanyang University); Jaegul Choo (Korea University)*,Deep Learning,Computer Vision Theory; Image and Video Synthesis; Vision Applications and Systems,Oral,3.2.1,1,,,,,,\n6265,Max-Sliced Wasserstein Distance and its use for GANs,\"Ishan Deshpande (-); Yuan-Ting Hu (University of Illinois at Urbana-Champaign); Ruoyu Sun (University of Illinois at Urbana-Champaign); Ayis Pyrros (Dupagemd); Nasir Siddiqui (Dupagemd); Sanmi Koyejo (University of Illinois, Urbana-Champaign); Zhizhen Zhao (University of Illinois at Urbana-Champaign); David Forsyth (University of Illinois at Urbana-Champaign); Alexander Schwing (UIUC)*\",Deep Learning,,Oral,3.2.1,1,,,,,,\n3073,Meta-Learning with Differentiable Convex Optimization,\"Kwonjoon Lee (UC San Diego)*; Subhransu Maji (University of Massachusetts, Amherst); Avinash Ravichandran (Amazon); Stefano Soatto (AWS Amazon ML)\",Deep Learning,Representation Learning,Oral,3.2.1,2,,,,,,\n4645,Unsupervised Visual Domain Adaptation: A Deep Max-Margin Gaussian Process 
Approach,\"Minyoung Kim (SeoulTech, Rutgers University)*; Pritish Sahu (Rutgers University); Behnam Gholami (Rutgers University); Vladimir Pavlovic (Rutgers University)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Statistical Learning\",Oral,3.2.1,2,,,,,,\n6332,Tangent-Normal Adversarial Regularization for Semi-supervised Learning,Bing Yu (Peking University); Jingfeng Wu (Peking University); Jinwen Ma (Peking University); Zhanxing Zhu (Peking University)*,Deep Learning,Representation Learning,Oral,3.2.1,2,,,,,,\n3306,Auto-Encoding Scene Graphs for Descriptive Image Captioning,XU YANG (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Hanwang Zhang (Nanyang Technological University); Jianfei Cai (Nanyang Technological University),Deep Learning,Vision + Language,Oral,3.2.1,3,,,,,,\n6218,\"Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech\",\"Aditya Deshpande (University of Illinois at UC)*; Jyoti Aneja (University of Illinois, Urbana-Champaign); Liwei Wang (Tencent AI Lab); Alexander Schwing (UIUC); David Forsyth (University of Illinois at Urbana-Champaign)\",Deep Learning,\"Big Data, Large Scale Methods\",Oral,3.2.1,3,,,,,,\n6105,Attention Branch Network: Learning of Attention Mechanism for Visual Explanation,Hiroshi Fukui (Chubu university)*; Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University),Deep Learning,\"Big Data, Large Scale Methods ; Computer Vision Theory; Face, Gesture, and Body Pose ; Recognition:\",Oral,3.2.1,3,,,,,,\n3796,Cascaded Projection: End-to-End Network Compression and Acceleration,Breton L Minnehan (Rochester Institute of Technology)*; Andreas Savakis (Rochester Institute of Technology),Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Oral,3.2.1,4,,,,,,\n5721,DeepCaps : Going Deeper with Capsule Networks,Jathushan Rajasegaran ( University of Moratuwa)*; Vinoj 
Jayasundara (University of Moratuwa); Sandaru Jayasekara (University of Moratuwa); Hirunima Jayasekara (University of Moratuwa); Ranga Rodrigo (University of Moratuwa); Suranga Seneviratne (University of Sydney),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Oral,3.2.1,4,,,,,,\n6240,DNASNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search,\"Bichen Wu (UC Berkeley)*; Xiaoliang Dai (Princeton University); Peizhao Zhang (Facebook); Yanghan Wang (Facebook); Fei Sun (Facebook); Yiming Wu (Facebook); Yuandong Tian (Facebook); Peter Vajda (Facebook); Yangqing Jia (Facebook); Kurt Keutzer (EECS, UC Berkeley)\",Deep Learning,,Oral,3.2.1,4,,,,,,\n5032,APDrawingGAN: Generating Artistic Portrait Drawings from Face Photos with Hierarchical GANs,Ran Yi (Tsinghua University); Yong-Jin Liu (Tsinghua University)*; Yukun Lai (Cardiff University); Paul Rosin (Cardiff University),Deep Learning,Others,Oral,3.2.1,5,,,,,,\n6431,Constrained Generative Adversarial Networks for Interactive Image Generation,Eric Heim (AFRL/RI)*,Deep Learning,Image and Video Synthesis; Representation Learning,Oral,3.2.1,5,,,,,,\n6807,WarpGAN: Automatic Caricature Generation,Yichun Shi (Michigan State University)*; Debayan Deb (Michigan State University); Anil Jain (Michigan State University),Deep Learning,\"Computer Vision Theory; Face, Gesture, and Body Pose\",Oral,3.2.1,5,,,,,,\n5199,Explainability Methods for Graph Convolutional Neural Networks,\"Phillip Pope (HRL Laboratories, LLC); Soheil Kolouri (HRL Laboratories LLC)*; Mohammad Rostami (HRL Laboratories, LLC); Charles Martin (HRL Laboratories, LLC); Heiko Hoffmann (HRL)\",Deep Learning,Deep Learning,Oral,3.2.1,6,,,,,,\n5502,A Generative Adversarial Density Estimator,Ehsan M Abbasnejad (Adelaide)*; Qinfeng Shi (University of Adelaide); Anton van den Hengel (University of Adelaide); Lingqiao Liu (University of Adelaide),Deep Learning,Representation Learning; Statistical 
Learning,Oral,3.2.1,6,,,,,,\n5921,SoDeep: a Sorting Deep net to learn ranking loss surrogates,\"Martin Engilberge (Technicolor, Sorbonne universités)*; Louis Chevallier (Technicolor); Patrick Pérez (Valeo.ai); Matthieu Cord (Sorbonne University)\",Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval; Vision + Language\",Oral,3.2.1,6,,,,,,\n4,High-Quality Face Capture Using Anatomical Muscles,Michael H Bao (Stanford University)*; Matthew D Cong (Industrial Light & Magic); Stephane Grabli (Industrial Light & Magic); Ronald Fedkiw (Stanford),\"Face, Gesture, and Body Pose\",3D from Single Image; Physics-based Vision and Shape-from-X; Vision + Graphics,Oral,3.2.2,1,Face & Body,,,,,\n2408,FML: Face Model Learning from Videos,Ayush Tewari (Max Planck Institute for Informatics)*; Florian Bernard (Max Planck Institute for Informatics); Pablo Garrido (Technicolor); Gaurav Bharaj (Technicolor); Mohamed Elgharib (Max Planck Institute for Informatics); Hans-Peter Seidel (Max Planck Institute for Informatics); Patrick Pérez (Valeo.ai); Michael Zollhoefer (Stanford University); Christian Theobalt (MPI Informatik),\"Face, Gesture, and Body Pose\",3D from Single Image; Vision + Graphics,Oral,3.2.2,1,,,,,,\n4483,AdaScale: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations,\"Xiao Zhang (Chinese University of Hong Kong); Rui Zhao (SenseTime Group Limited); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*; Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Hongsheng Li (Chinese University of Hong Kong)\",\"Face, Gesture, and Body Pose\",\"Biometrics ; Deep Learning ; Recognition: Detection, Categorization, Retrieval; Representation Learn\",Oral,3.2.2,1,,,,,,\n387,3D Hand Shape and Pose Estimation from a Single RGB Image,\"Liuhao Ge (Nanyang Technological University)*; Zhou Ren (Snap Inc.); Yuncheng Li (Snap); Zehao Xue (Snap Inc.); Yingying Wang (Snap Inc.); Jianfei Cai 
(Nanyang Technological University); Junsong Yuan (\"\"State University of New York at Buffalo, USA\"\")\",\"Face, Gesture, and Body Pose\",3D from Single Image,Oral,3.2.2,2,,,,,,\n647,3D hand shape and pose from images in the wild,Adnane Boukhayma (University of Oxford)*; Rodrigo de Bem (University of Oxford); Philip Torr (University of Oxford),\"Face, Gesture, and Body Pose\",3D from Single Image,Oral,3.2.2,2,,,,,,\n843,Self supervised 3D hand pose estimation,Chengde Wan (ETHZ)*; Thomas Probst (ETH Zurich); Luc Van Gool (ETH Zurich); Angela Yao (National University of Singapore),\"Face, Gesture, and Body Pose\",RGBD sensors and analytics,Oral,3.2.2,2,,,,,,\n1497,CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark,Jiefeng Li (Shanghai Jiao Tong University); Can Wang (SJTU); Hao Zhu (Shanghai Jiao Tong University); Yihuan Mao (Tsinghua University); Hao-Shu Fang (SJTU); Cewu Lu (Shanghai Jiao Tong University)*,\"Face, Gesture, and Body Pose\",\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Oral,3.2.2,3,,,,,,\n3233,Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in A Triadic Interaction,\"Hanbyul Joo (CMU)*; Tomas Simon (CMU); Mina Cikara (Department of Psychology, Harvard); Yaser Sheikh (CMU)\",\"Face, Gesture, and Body Pose\",Datasets and Evaluation,Oral,3.2.2,3,,,,,,\n6947,\"Synergistic, Part-Based 3D Human Reconstruction In-The-Wild\",Alp Guler (Imperial College London)*; Iasonas Kokkinos (UCL),\"Face, Gesture, and Body Pose\",3D from Single Image; Deep Learning,Oral,3.2.2,3,,,,,,\n2239,Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation,Xipeng Chen (Sun Yat-sen University); Kwan-Yee Lin (Peking university); Wentao Liu (Sensetime); Chen Qian (SenseTime); Liang Lin (Sun Yat-sen University)*,\"Face, Gesture, and Body Pose\",Deep Learning,Oral,3.2.2,4,,,,,,\n3999,In the Wild Human Pose Estimation using Explicit 2D Features and Intermediate 3D 
Representations,\"Ikhsanul Habibie (Max Planck Institute for Informatics)*; Weipeng Xu (MPII); Dushyant Mehta (MPI Informatics); Gerard Pons-Moll (MPII, Germany); Christian Theobalt (MPI Informatik)\",\"Face, Gesture, and Body Pose\",3D from Single Image; Motion and Tracking,Oral,3.2.2,4,,,,,,\n33,DensePose-Slim: Cheaper Learning from Motion Cues,Natalia Neverova (Facebook AI Research)*; James Thewlis (University of Oxford); Alp Guler (Imperial College London); Iasonas Kokkinos (UCL); Andrea Vedaldi (Oxford University),\"Face, Gesture, and Body Pose\",Datasets and Evaluation; Deep Learning ; Motion and Tracking,Oral,3.2.2,4,,,,,,\n2859,Twin-Cycle Autoencoder: Self-supervised Representation Learning from Entangled Movement for Facial Action Unit Detection,\"Yong Li (Institute of Computing Technology, Chinese Academy of Sciences); Jiabei Zeng (Institute of Computing Technology, Chinese Academy of Sciences)*; Shiguang Shan (Chinese Academy of Sciences); Xilin Chen (China)\",\"Face, Gesture, and Body Pose\",Deep Learning ; Representation Learning,Oral,3.2.2,5,,,,,,\n4558,Combining 3D Morphable Models: A Largescale Face-and-Head Model,\"Stylianos Ploumpis (Imperial College London)*; Haoyang Wang (Imperial College London); Nick E. 
Pears (University of York, UK); William Smith (University of York); Stefanos Zafeiriou (Imperial College London)\",\"Face, Gesture, and Body Pose\",3D from Single Image; Vision + Graphics,Oral,3.2.2,5,,,,,,\n4364,Boosting Local Shape Matching for Dense 3D Face Correspondence,Zhenfeng Fan (The Chinese academy of science)*; hu xiyuan (The Chinese academy of science); Chen Chen (The Chinese academy of science); peng silong (The Chinese academy of science),\"Face, Gesture, and Body Pose\",3D from Multiview and Sensors; 3D from Single Image; Physics-based Vision and Shape-from-X; RGBD sen,Oral,3.2.2,5,,,,,,\n2886,Unsupervised Part-Based Disentangling of Object Shape and Appearance,Dominik Lorenz (Heidelberg University); Leonard Bereska (Heidelberg University); Timo Milbich (Heidelberg University)*; Bjorn Ommer (Heidelberg University),\"Face, Gesture, and Body Pose\",Image and Video Synthesis,Oral,3.2.2,6,,,,,,\n2922,\"Monocular Total Capture: Posing Face, Body, and Hands in the Wild\",Donglai Xiang (Carnegie Mellon University)*; Hanbyul Joo (CMU); Yaser Sheikh (CMU),\"Face, Gesture, and Body Pose\",3D from Single Image; Vision + Graphics,Oral,3.2.2,6,,,,,,\n3128,\"Expressive Body Capture: 3D Hands, Face, and Body from a Single Image\",Georgios Pavlakos (University of Pennsylvania)*; Michael J. 
Black (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Vasileios Choutas (Max Planck Institute for Intelligent Systems); Nima Ghorbani (Max Planck Institute Tübingen); Ahmed A A Osman (Max Planck Institute for Intelligent Systems); Dimitrios Tzionas (Max Planck Institute for Intelligent Systems),\"Face, Gesture, and Body Pose\",3D from Single Image,Oral,3.2.2,6,,,,,,\n707,Neural RGB -> D Sensing: Depth and Uncertainty from a Video Camera,Chao Liu (Carnegie Mellon University); Jinwei Gu (NVIDIA)*; Kihwan Kim (NVIDIA); Srinivasa G Narasimhan (Carnegie Mellon University); Jan Kautz (NVIDIA),Low-level Vision,3D from Multiview and Sensors; 3D from Single Image; Computational Photography,Oral,3.2.3,1,Low-level & Optimization,,,,,\n1006,DAVANet: Stereo Deblurring with View Aggregation,\"Shangchen Zhou (Sensetime Research)*; Jiawei Zhang (Sensetime Research); Jimmy Ren (SenseTime Research); Wangmeng Zuo (Harbin Institute of Technology, China); Haozhe Xie (Harbin Institute of Technology); Jinshan Pan (Nanjing University of Science and Technology)\",Low-level Vision,Deep Learning ; RGBD sensors and analytics,Oral,3.2.3,1,,,,,,\n3657,DVC: An End-to-end Deep Video Compression Framework,Guo Lu (Shanghai Jiao Tong University)*; Wanli Ouyang (The University of Sydney); Dong Xu (University of Sydney); Chunlei Cai (Shanghai Jiao Tong University); Xiaoyun Zhang (Shanghai Jiao Tong University); Zhiyong Gao (Shanghai Jiao Tong University),Low-level Vision,Vision Applications and Systems,Oral,3.2.3,1,,,,,,\n1098,SOSNet: Second Order Similarity Regularization for Local Descriptor Learning,\"yurun tian (National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences); Xin Yu (Australian National University); Bin Fan (Institute of Automation, Chinese Academy of Sciences, China)*; Fuchao Wu (National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences); Huub 
Heijnen (Scape Technologies); Vassileios Balntas (Scape Technologies)\",Low-level Vision,\"Recognition: Detection, Categorization, Retrieval\",Oral,3.2.3,2,,,,,,\n2154,“Double-DIP”: Unsupervised Image Decomposition via Coupled Deep-Image-Priors,\"Yosef Gandelsman (Weizmann Institute of Science)*; Assaf Shocher (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel)\",Low-level Vision,Computational Photography; Deep Learning,Oral,3.2.3,2,,,,,,\n2579,Unprocessing Images for Learned Raw Denoising,Tim Brooks (Google)*; Ben Mildenhall (UC Berkeley); Tianfan Xue (MIT); Jiawen Chen (Google); Dillon Sharlet (Google); Jonathan T Barron (Google Research),Low-level Vision,Computational Photography; Deep Learning,Oral,3.2.3,2,,,,,,\n3342,Residual Networks for Light Field Image Super-Resolution,Shuo Zhang (Beijing Jiaotong University)*; Youfang Lin (Beijing Jiaotong University); Hao Sheng (Beihang University),Low-level Vision,3D from Multiview and Sensors; Computational Photography; Deep Learning,Oral,3.2.3,3,,,,,,\n3959,Modulating Image Restoration with Continual Levels via Adaptive Feature Modification Layers,\"Jingwen He (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Chao Dong (SIAT)*; Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)\",Low-level Vision,Deep Learning,Oral,3.2.3,3,,,,,,\n5318,Second-order Attention Network for Single Image Super-resolution,\"Tao Dai (Tsinghua University)*; Jianrui Cai (The Hong Kong Polytechnic University, Hong Kong, China); yongbing zhang (Tsinghua University); Shutao Xia (Tsinghua University); Lei Zhang (\"\"Hong Kong Polytechnic University, Hong Kong, China\"\")\",Low-level Vision,Deep Learning,Oral,3.2.3,3,,,,,,\n2599,Devil is in the Edges: Learning Semantic Boundaries from Noisy Annotations,David Acuna (University of Toronto)*; Amlan Kar (University of Toronto); Sanja Fidler (University of Toronto),\"Segmentation, Grouping and Shape\",Low-level 
Vision,Oral,3.2.3,4,,,,,,\n3097,Path-Invariant Map Networks,\"Zaiwei Zhang (University of Texas at Austin); Zhenxiao Liang (The University of Texas at Austin); Lemeng Wu (The University of Texas at Austin); Xiaowei Zhou (Zhejiang Univ., China); Qixing Huang (The University of Texas at Austin)*\",Optimization Methods,\"Big Data, Large Scale Methods ; Deep Learning\",Oral,3.2.3,4,,,,,,\n5608,FilterReg: Robust and Efficient Probabilistic Point-Set Registration using Gaussian Filter and Twist Parameterization,Wei Gao (MIT)*; Russ Tedrake (MIT),Low-level Vision,Motion and Tracking; Optimization Methods; RGBD sensors and analytics; Robotics + Driving,Oral,3.2.3,4,,,,,,\n108,Probabilistic Permutation Synchronization using the Riemannian Structure of the Birkhoff Polytope,Tolga Birdal (TU Munich)*; Umut Simsekli (Telecom ParisTech),Optimization Methods,Computer Vision Theory; Low-level Vision; Statistical Learning,Oral,3.2.3,5,,,,,,\n190,Lifting Vectorial Variational Problems: A Natural Formulation based on Geometric Measure Theory and Discrete Exterior Calculus,Thomas Möllenhoff (Technical University of Munich)*; Daniel Cremers (TUM),Optimization Methods,,Oral,3.2.3,5,,,,,,\n1428,A Sufficient Condition for Convergences of Adam and RMSProp,Fangyu Zou (stonybrook); Li Shen (Tencent AI Lab)*; Zequn Jie (Tencent AI Lab); Weizhong Zhang (Tencent AI Lab); Wei Liu (Tencent),Optimization Methods,\"Big Data, Large Scale Methods\",Oral,3.2.3,5,,,,,,\n5959,Guaranteed Matrix Completion under Multiple Linear Transformations,Chao Li (RIKEN)*; Wei He (RIKEN AIP); Longhao Yuan (Saitama Institute of Technology/RIKEN AIP); Zhun Sun (RIKEN Center for AIP); Qibin Zhao (RIKEN),Low-level Vision,Computational Photography; Statistical Learning,Oral,3.2.3,6,,,,,,\n4802,MAP inference via Block-Coordinate Frank-Wolfe Algorithm,\"Paul Swoboda (MPI fuer Informatik, Saarbruecken)*; Vladimir Kolmogorov (Institute of Science and Technology, Austria)\",Optimization Methods,,Oral,3.2.3,6,,,,,,\n5321,A 
convex relaxation for multi-graph matching,\"Paul Swoboda (MPI fuer Informatik, Saarbruecken)*; Ashkan Mokarian (BIH/MDC); Dagmar Kainmueller (BIH/MDC); Christian Theobalt (MPI Informatik); Florian Bernard (Max Planck Institute for Informatics)\",Optimization Methods,\"Segmentation, Grouping and Shape\",Oral,3.2.3,6,,,,,," }, { "path": "cvpr_2019_poster.csv", "content": "Paper ID,Paper Title,Author Names,Primary Subject Area,Secondary Subject Areas,Status,Day.Time,Poster #,Topic,,Topic Legend,\n5,Finding Task-Relevant Features for Few-Shot Learning by Category Traversal,\"Hongyang Li (The Chinese University of Hong Kong)*; David Eigen (Clarifai Inc.); Samuel Dodge (Clarifai Inc.); Matt Zeiler (Clarifai Inc.); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",Deep Learning,,Oral,1.1,1,A,,A,Deep Learning\n6340,Edge-Labeling Graph Neural Network for Few-shot Learning,Jongmin Kim (KAIST)*; Taesup Kim (Université de Montréal); Sungwoong Kim (Kakao Brain); Chang D. Yoo (KAIST),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning; Segmentation, Grouping a\",Oral,1.1,2,A,,B,Recognition\n5728,Generating Classification Weights with Graph Neural Networks for Few-Shot Learning,\"Spyros Gidaris (valeo.ai)*; Nikos Komodakis (\"\"ENPC, France\"\")\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Oral,1.1,3,A,,C,\"Seg, Grouping, Shape\"\n257,Kervolutional Neural Networks,\"Chen Wang (Nanyang Technological University)*; JIANFEI YANG (Nanyang Technological University); Prof. Dr. 
Respected Colleauge (IJCAS Editorial Member); Junsong Yuan (\"\"State University of New York at Buffalo, USA\"\")\",Deep Learning,Computer Vision Theory,Oral,1.1,4,A,,D,\"Stats, Physics, Theory, & Datasets\"\n4863,Why ReLu networks yield high-confidence predictions far away from the training data and how to mitigate the problem,Matthias Hein (University of Tuebingen)*; Maksym Andriushchenko (Saarland University); Julian Bitterwolf (University of Tuebingen),Deep Learning,Statistical Learning,Oral,1.1,5,A,,E,3D Multiview\n6679,On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions,Yusuke Tsuzuku (The University of Tokyo/RIKEN)*; Issei Sato (The university of Tokyo/RIKEN),Deep Learning,Others,Oral,1.1,6,A,,F,3D Single View & RGBD\n948,Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization,Siyuan Qiao (Johns Hopkins University)*; Zhe Lin (Adobe Research); Jianming Zhang (Adobe Research); Alan Yuille (Johns Hopkins University),Deep Learning,,Oral,1.1,7,A,,G,Face & Body\n2284,Hardness-Aware Deep Metric Learning,Wenzhao Zheng (Tsinghua University); Zhaodong Chen (Tsinghua University); Jiwen Lu (Tsinghua University)*; Jie Zhou (Tsinghua University),Deep Learning,Representation Learning,Oral,1.1,8,A,,H,Action & Video\n1183,Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation,Chenxi Liu (Johns Hopkins University)*; Liang-Chieh Chen (Google Inc.); Florian Schroff (Google Inc.); Hartwig Adam (Google); Wei Hua (Google); Alan Yuille (Johns Hopkins University); Li Fei-Fei (Stanford University),Deep Learning,\"Segmentation, Grouping and Shape\",Oral,1.1,9,A,,I,Motion & Biometrics\n1535,Learning to Learn Loss for Active Learning,Donggeun Yoo (Lunit)*; In So Kweon (KAIST),Deep Learning,\"Others; Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Oral,1.1,10,A,,J,Synthesis\n2230,Striking the Right Balance with 
Uncertainty,Salman Khan (Australian National University (ANU)); Munawar Hayat (University of Canberra); Waqas Zamir (IIAI); Jianbing Shen (Beijing Institute of Technology)*; Ling Shao (Inception Institute of Artificial Intelligence),Deep Learning,Low-level Vision,Oral,1.1,11,A,,K,Comp. Photography & Graphics\n2368,AutoAugment: Learning Augmentation Strategies from Data,Ekin D Cubuk (Google Brain)*; Barret Zoph (Google); Dandelion Mane (Protocol Labs); Vijay Vasudevan (Google Brain); Quoc Le (Google Brain),Deep Learning,,Oral,1.1,12,A,,L,Low-level & Optimization\n143,Parsing R-CNN for Instance-Level Human Analysis,Lu Yang (Beijing University of Posts and Telecommunications); Qing Song (Beijing University of Posts and Telecommunications)*; Zhihui Wang (Beijing University of Posts and Telecommunications); Ming Jiang (WiWide Inc.),Deep Learning,\"Face, Gesture, and Body Pose ; Recognition: Detection, Categorization, Retrieval; Segmentation, Grou\",Poster,1.1,13,A,,M,Scenes and Representation\n306,Large Scale Incremental Learning,Yue Wu (northeastern university)*; Yinpeng Chen (Microsoft); Lijuan Wang (Microsoft); Yuancheng Ye (CCNY); Zicheng Liu (Microsoft); Yandong Guo (MSR); YUN FU (Northeastern University),Deep Learning,\"Big Data, Large Scale Methods ; Face, Gesture, and Body Pose\",Poster,1.1,14,A,,N,Language & Reasoning\n365,Structural Point Cloud Decoder,\"Lyne P Tchapmi (Stanford University)*; Hamid Rezatofighi (University of Adelaide); Vineet S Kosaraju (Stanford Vision & Learning Lab); Ian Reid (\"\"University of Adelaide, Australia\"\"); Silvio Savarese (Stanford University)\",Deep Learning,3D from Multiview and Sensors; 3D from Single Image; Physics-based Vision and Shape-from-X; RGBD sen,Poster,1.1,15,A,,O,\"Applications, Medical, Robotics\"\n423,Perceive Where to Focus: Learning Visibility-aware Part-level Features for Partial Person Re-identification,Yifan Sun (Tsinghua University); Ya-Li Li (THU); Qin Xu (Tsinghua University); Chi Zhang (Megvii Inc.); 
Yikang Li (CUHK); Shengjin Wang (Tsinghua University)*; Jian Sun (Megvii Technology),Deep Learning,Representation Learning,Poster,1.1,16,A,,,\n454,Meta-Transfer Learning for Few-Shot Learning,Qianru Sun (MPI)*; Yaoyao Liu (Tianjin University); Tat-Seng Chua (National Univ. of Singapore); Bernt Schiele (MPI Informatics),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,17,A,,,\n635,Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation,\"Bohan Zhuang (The University of Adelaide)*; Mingkui Tan (South China University of Technology); Lingqiao Liu (University of Adelaide); Chunhua Shen (University of Adelaide); Ian Reid (\"\"University of Adelaide, Australia\"\")\",Deep Learning,\"Representation Learning; Segmentation, Grouping and Shape\",Poster,1.1,18,A,,,\n686,Deep RNN Framework for Visual Sequential Applications,Bo Pang (Shanghai Jiao Tong University); Kaiwen Zha (Shanghai Jiao Tong University); Hanwen Cao (Shanghai Jiao Tong University); Chen Shi (Shanghai Jiao Tong University); Cewu Lu (Shanghai Jiao Tong University)*,Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Video Analytics\",Poster,1.1,19,A,,,\n750,Graph-Based Global Reasoning Networks,Yunpeng Chen (National University of Singapore); Yannis Kalantidis (Facebook Research)*; Marcus Rohrbach (Facebook AI Research); Zhicheng Yan (Facebook AI); Yan Shuicheng (National University of Singapore); Jiashi Feng (NUS),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,1.1,20,A,,,\n778,SSN: Learning Sparse Switchable Normalization via SparsestMax,\"Wenqi Shao (The Chinese University of HongKong)*; Tianjian Meng (University of Pittsburgh); Jingyu Li (SenseTime Research); Ruimao Zhang (The Chinese University of Hong Kong); Yudian Li (SenseTime); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Ping Luo (The Chinese University of Hong Kong)\",Deep Learning,\"Action Recognition 
; Optimization Methods; Recognition: Detection, Categorization, Retrieval; Vision\",Poster,1.1,21,A,,,\n823,Spherical Fractal Convolution Neural Networks for Point Cloud Recognition,Yongming Rao (Tsinghua University); Jiwen Lu (Tsinghua University)*; Jie Zhou (Tsinghua University),Deep Learning,\"3D from Multiview and Sensors; Recognition: Detection, Categorization, Retrieval; Representation Lea\",Poster,1.1,22,A,,,\n830,Task-Aware Synthetic Data Generation,Shashank Tripathi (Carnegie Mellon University); Siddhartha Chandra (Amazon); Ambrish Tyagi (Amazon); James Rehg (Georgia Institute of Technology); Visesh Chari (Amazon Lab126)*; Amit Agrawal (Amazon); Kris Kitani (CMU),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding\",Poster,1.1,23,A,,,\n833,Divide and Conquer the Embedding Space for Metric Learning,Artsiom O Sanakoyeu (Heidelberg University)*; Vadim Tschernezki (Heidelberg University); Uta Büchler (Heidelberg University); Bjorn Ommer (Heidelberg University),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,1.1,24,A,,,\n858,Latent Space Autoregression for Novelty Detection,\"Davide Abati (University of Modena and Reggio Emilia)*; Angelo Porrello (University of Modena and Reggio Emilia); SIMONE CALDERARA (University of Modena and Reggio Emilia, Italy); Rita Cucchiara (Universita Di Modena E Reggio Emilia)\",Deep Learning,Representation Learning; Scene Analysis and Understanding; Video Analytics; Vision Applications and,Poster,1.1,25,A,,,\n934,Attending to Discriminative Certainty for Domain Adaptation,Vinod Kumar Kurmi (IIT Kanpur)*; Shanu Kumar (IIT Kanpur); Vinay P Namboodiri (IIT Kanpur),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,26,A,,,\n1061,Feature Denoising for Improving Adversarial Robustness,Cihang Xie ( Johns Hopkins University ); Yuxin Wu (Facebook AI Research); Laurens van der Maaten (Facebook); Alan Yuille (Johns 
Hopkins University); Kaiming He (Facebook AI Research)*,Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,27,A,,,\n1065,Selective Kernel Networks,Xiang Li (Nanjing University of Science and Technology)*; Xiaolin Hu (Tsinghua University); Wenhai Wang (Nanjing university); Jian Yang (Nanjing University of Science and Technology),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,28,A,,,\n1146,On Implicit Filter Level Sparsity in Convolutional Neural Networks,Dushyant Mehta (MPI Informatics)*; Kwang In Kim (UNIST); Christian Theobalt (MPI Informatik),Deep Learning,Optimization Methods; Representation Learning,Poster,1.1,29,A,,,\n1167,FlowNet3D: Learning Scene Flow in 3D Point Clouds,Xingyu Liu (Stanford University)*; Charles R. Qi (Facebook AI Research); Leonidas Guibas (Stanford University),Deep Learning,Low-level Vision; Motion and Tracking; Robotics + Driving,Poster,1.1,30,A,,,\n1218,Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks,Kuan Fang (Stanford University); Alexander Toshev (Google)*; Silvio Savarese (Stanford University); Li Fei-Fei (Stanford University),Deep Learning,Robotics + Driving,Poster,1.1,31,A,,,\n1222,Co-occurrent Features in Semantic Segmentation,Hang Zhang (Amazon Inc)*; Han Zhang (Google); Chenguang Wang (Amazon AI); Junyuan Xie (Amazon),Deep Learning,\"Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,1.1,32,A,,,\n1246,Bag of Tricks to Train Convolutional Neural Networks for Image Classification,Junyuan Xie (Amazon); Tong He (Amazon)*; Zhi Zhang (Amazon); Hang Zhang (Amazon Inc); Zhongyue Zhang (Amazon); Mu Li (Amazon),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,33,A,,,\n1279,Learning Channel-wise Interactions for Binary Convolutional Neural Networks,Ziwei Wang (Tsinghua University); Jiwen Lu (Tsinghua University)*; Chenxin Tao (Tsinghua University); Jie Zhou (Tsinghua University); Qi Tian (Huawei Noah’s Ark 
Lab),Deep Learning,,Poster,1.1,34,A,,,\n1290,Knowledge Translation and Adaptation for Efficient Semantic Segmentation,Tong He (The University of Adelaide); Chunhua Shen (University of Adelaide)*; Zhi Tian (The University of Adelaide); Dong Gong (The University of Adelaide); Youliang Yan (Huawei); Changming Sun (CSIRO Data61),Deep Learning,\"Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,1.1,35,A,,,\n7000,Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness against Adversarial Attack,Zhezhi He (University of Central Florida)*; Adnan Siraj Rakin (University of Central Florida); Deliang Fan (University of Central Florida),Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Poster,1.1,36,A,,,\n17,Invariance Matters: Exemplar Memory for Domain Adaptive Person Re-identification,\"Zhun Zhong (Xiamen University)*; Liang Zheng (Australian National University); Zhiming Luo (Xiamen University); Shaozi Li (Xiamen University, China); Yi Yang (UTS)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.1,37,B,,,\n19,Dissecting Person Re-identification from the Viewpoint of Viewpoint,Xiaoxiao Sun (Singapore University of Technology and Design)*; Liang Zheng (Australian National University),\"Recognition: Detection, Categorization, Retrieval\",Datasets and Evaluation,Poster,1.1,38,B,,,\n158,Learning to Reduce Dual-level Discrepancy for Infrared-Visible Person Re-identification,Zhixiang Wang (National Taiwan University); Zheng Wang (National Institute of Informatics)*; Yinqiang Zheng (National Institute of Informatics); Yung-Yu Chuang (National Taiwan University); Shin'ichi Satoh (National Institute of Informatics),\"Recognition: Detection, Categorization, Retrieval\",\"Face, Gesture, and Body Pose\",Poster,1.1,39,B,,,\n208,Progressive Feature Alignment for Unsupervised Domain Adaptation,Chaoqi Chen (Xiamen University); Weiping Xie (Xiamen University); 
Tingyang Xu (Tencent AI Lab); Wenbing Huang (Tencent AI Lab); Yu Rong (Tencent AI Lab); Xinghao Ding (Xiamen University); Yue Huang (Xiamen University)*; Junzhou Huang (University of Texas at Arlington),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,1.1,40,B,,,\n278,Feature-level Frankenstein: Eliminating Variations for Discriminative Recognition,\"Xiaofeng Liu (CMU)*; B.V.K. Kumar (CMU); Site Li (Carnegie Mellon University); Ping Jia (Changchun Institute of Optics, Fine Mechanics and Physics); Wanqing Xie (Harvard Medical School); Jane You (HK Poly U); Lingsheng Kong (Changchun Institute of Optics, Fine Mechanics and Physics)\",\"Recognition: Detection, Categorization, Retrieval\",Biometrics,Poster,1.1,41,B,,,\n300,Learning a Deep ConvNet for Multi-label Classification with Partial Labels,Thibaut Durand (Simon Fraser University)*; Nazanin Mehrasa (Simon Fraser University); Greg Mori (Simon Fraser University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.1,42,B,,,\n366,Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression,\"Hamid Rezatofighi (University of Adelaide)*; Nathan Tsoi (Stanford University); JunYoung Gwak (Stanford University); Amir A Sadeghian (Stanford); Ian Reid (\"\"University of Adelaide, Australia\"\"); Silvio Savarese (Stanford University)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.1,43,B,,,\n677,Densely Semantically Aligned Person Re-Identification,Zhizheng Zhang (University of Science and Technology of China); Cuiling Lan (Microsoft Research)*; Wenjun Zeng (Microsoft Research); Zhibo Chen (University of Science and Technology of China),\"Recognition: Detection, Categorization, Retrieval\",Representation Learning; Vision Applications and Systems,Poster,1.1,44,B,,,\n698,Generalising Fine-Grained Sketch-Based Image Retrieval,Kaiyue Pang (Queen Mary University of London)*; Ke Li (BUPT); Yongxin Yang 
(University of Edinburgh ); Honggang Zhang (Beijing University of Posts and Telecommunications); Yi-Zhe Song (Queen Mary University of London); Tao Xiang (University of Surrey); Timothy Hospedales (Edinburgh University),\"Recognition: Detection, Categorization, Retrieval\",\"Big Data, Large Scale Methods ; Deep Learning ; Vision Applications and Systems\",Poster,1.1,45,B,,,\n764,Adapting Object Detectors via Selective Cross-Domain Alignment,Xinge Zhu (The Chinese University of Hong Kong)*; Jiangmiao Pang (Zhejiang University); Ceyuan Yang (Chinese University of Hong Kong); Jianping Shi (Sensetime Group Limited); Dahua Lin (The Chinese University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",\"Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,1.1,46,B,,,\n774,Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation,\"Yunhang Shen (Xiamen University); Rongrong Ji (Xiamen University, China)*; Yan Wang (Microsoft); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd); Liujuan Cao (Xiamen University)\",\"Recognition: Detection, Categorization, Retrieval\",,Poster,1.1,47,B,,,\n802,Thinking Outside the Pool: Active Training Image Creation for Relative Attributes,Aron Yu (University of Texas at Austin)*; Kristen Grauman (Facebook AI Research & UT Austin),\"Recognition: Detection, Categorization, Retrieval\",Image and Video Synthesis,Poster,1.1,48,B,,,\n848,Generalizable Person Re-identification by Domain-Invariant Mapping Network,\"Jifei Song (Queen Mary, University of London)*; Yongxin Yang (University of Edinburgh ); Yi-Zhe Song (Queen Mary University of London); Tao Xiang (University of Surrey); Timothy Hospedales (Edinburgh University)\",\"Recognition: Detection, Categorization, Retrieval\",\"Big Data, Large Scale Methods ; Datasets and Evaluation; Deep Learning ; Face, Gesture, and Body Pos\",Poster,1.1,49,B,,,\n851,Visual Attention Consistency under Image Transforms for Multi-label Image Classification,Hao 
Guo (University of South Carolina)*; Kang Zheng (University of South Carolina); Xiaochuan Fan (University of South Carolina); Hongkai Yu (University of Texas - Rio Grande Valley); Song Wang (University of South Carolina),\"Recognition: Detection, Categorization, Retrieval\",,Poster,1.1,50,B,,,\n866,Re-ranking via Metric Fusion for Object Retrieval and Person Re-identification,Song Bai (University of Oxford)*; Peng Tang (Huazhong University of Science and Technology); Longin Jan Latecki (Temple University); Philip Torr (University of Oxford),\"Recognition: Detection, Categorization, Retrieval\",\"3D from Multiview and Sensors; Representation Learning; Segmentation, Grouping and Shape\",Poster,1.1,51,B,,,\n956,Unsupervised Domain Adaptation by Semantic Discrepancy Minimization,\"Junbao Zhuo (ICT CAS); Shuhui Wang (VIPL, ICT, Chinese Academy of Sciences)*; Shuhao Cui (ICT CAS); Qingming Huang (University of Chinese Academy of Sciences)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning; Visual Reasoning,Poster,1.1,52,B,,,\n964,Weakly Supervised Person Re-Identification,\"Jingke Meng (Sun Yat-sen University); Sheng Wu (Sun Yat-sen University); WEI-SHI ZHENG (Sun Yat-sen University, China)*\",\"Recognition: Detection, Categorization, Retrieval\",\"Face, Gesture, and Body Pose ; Scene Analysis and Understanding\",Poster,1.1,53,B,,,\n993,PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud,\"Shaoshuai Shi (The Chinese University of Hong Kong)*; Hongsheng Li (Chinese University of Hong Kong); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",\"Recognition: Detection, Categorization, Retrieval\",3D from Multiview and Sensors,Poster,1.1,54,B,,,\n1023,Automatic adaptation of object detectors to new domains using self-training,\"Aruni RoyChowdhury (University of Massachusetts, Amherst)*; Prithvijit Chakrabarty (University of Massachusetts, Amherst); Ashish Singh (UMASS Amherst); SouYoung Jin (UMASS 
Amherst); Huaizu Jiang (UMass Amherst); Liangliang Cao (UMass Amherst); Erik Learned-Miller (University of Massachusetts, Amherst)\",\"Recognition: Detection, Categorization, Retrieval\",\"Big Data, Large Scale Methods ; Face, Gesture, and Body Pose ; Motion and Tracking; Robotics + Drivi\",Poster,1.1,55,B,,,\n1050,Deep Sketch-Shape Hashing with Segmented 3D Stochastic Viewing,Jiaxin Chen (Inception Institute of Artificial Intelligence)*; Jie Qin (Inception Institute of Artificial Intelligence); Li Liu (Inception Institute of Artificial Intelligence); Fan Zhu (Inception Institute of Artificial Intelligence); Fumin Shen (UESTC); Jin Xie (Nanjing University of Science and Technology); Ling Shao (Inception Institute of Artificial Intelligence),\"Recognition: Detection, Categorization, Retrieval\",3D from Multiview and Sensors; Deep Learning,Poster,1.1,56,B,,,\n1062,Generative Dual Adversarial Network for Generalized Zero-shot Learning,He Huang (University of Illinois at Chicago)*; Changhu Wang (ByteDance.Inc); Philip S Yu (UIC); Chang-Dong Wang (Sun Yat-sen University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.1,57,B,,,\n1115,Query-guided End-to-End Person Search,\"Bharti Munjal (OSRAM)*; Sikandar Amin (OSRAM GmbH); Federico Tombari (Technical University of Munich, Germany); Fabio Galasso (OSRAM)\",\"Recognition: Detection, Categorization, Retrieval\",Scene Analysis and Understanding,Poster,1.1,58,B,,,\n1128,Libra R-CNN: Balanced Learning for Object Detection,Jiangmiao Pang (Zhejiang University)*; Kai Chen (The Chinese University of Hong Kong); Jianping Shi (Sensetime Group Limited); Wanli Ouyang (The University of Sydney); Dahua Lin (The Chinese University of Hong Kong); Huajun Feng (Zhejiang University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.1,59,B,,,\n1165,Learning a Unified Classifier Incrementally via Rebalancing,\"Saihui Hou (University of Science and Technology of China)*; Xinyu Pan 
(MMLAB, CUHK); Chen Change Loy (Nanyang Technological University); Zilei Wang (University of Science and Technology of China); Dahua Lin (The Chinese University of Hong Kong)\",\"Recognition: Detection, Categorization, Retrieval\",Representation Learning,Poster,1.1,60,B,,,\n1260,Feature Selective Anchor-Free Module for Single-Shot Object Detection,Chenchen Zhu (Carnegie Mellon University)*; Yihui He (Carnegie Mellon University); Marios Savvides (Carnegie Mellon University),\"Recognition: Detection, Categorization, Retrieval\",,Poster,1.1,62,B,,,\n1324,Bottom-up Object Detection by Grouping Extreme and Center Points,Xingyi Zhou (The University of Texas at Austin)*; Jiacheng Zhuo (The University of Texas at Austin); Philipp Kraehenbuehl (UT Austin),\"Recognition: Detection, Categorization, Retrieval\",\"Deep Learning ; Face, Gesture, and Body Pose ; Segmentation, Grouping and Shape\",Poster,1.1,63,B,,,\n6349,Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples,Zihao Liu (Florida International University)*; Tao Liu (Florida International University); Qi Liu (Florida International University ); Nuo Xu (Florida International University); Xue Lin (Northeastern University); Yanzhi Wang (Northeastern University); Wujie Wen (Florida International University),\"Recognition: Detection, Categorization, Retrieval\",Others,Poster,1.1,64,B,,,\n68,SCOPS: Self-Supervised Co-Part Segmentation,\"Wei-Chih Hung (University of California, Merced)*; Varun Jampani (Nvidia Research); Sifei Liu (NVIDIA); Pavlo Molchanov (NVIDIA); Ming-Hsuan Yang (University of California at Merced); Jan Kautz (NVIDIA)\",\"Segmentation, Grouping and Shape\",Deep Learning ; Scene Analysis and Understanding,Poster,1.1,65,C,,,\n838,Unsupervised Moving Object Detection via Contextual Information Separation,\"Yanchao Yang (UCLA)*; Antonio Loquercio (ETH / University of Zurich); Davide Scaramuzza (University of Zurich & ETH Zurich, Switzerland); Stefano Soatto (UCLA)\",\"Segmentation, 
Grouping and Shape\",Representation Learning; Scene Analysis and Understanding; Statistical Learning,Poster,1.1,66,C,,,\n1291,Pose2Seg: Detection Free Human Instance Segmentation,Song-Hai Zhang (Tsinghua University); Ruilong Li (Tsinghua University)*; Xin Dong (Tsinghua University); Paul Rosin (Cardiff University); Zixi Cai (Tsinghua University); Xi Han (Tsinghua University); Dingcheng Yang (Tsinghua University); Hao-Zhi Huang (Tencent AI Lab); Shimin Hu (Tsinghua University),\"Segmentation, Grouping and Shape\",\"Face, Gesture, and Body Pose\",Poster,1.1,67,C,,,\n156,DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios,Guorun Yang (Tsinghua University)*; Xiao Song (Sensetime Group Limited); Chaoqin Huang (Shanghai Jiao Tong University); Zhidong Deng (Tsinghua University); Jianping Shi (Sensetime Group Limited); Bolei Zhou (CUHK),Datasets and Evaluation,3D from Multiview and Sensors; Low-level Vision; Robotics + Driving,Poster,1.1,68,D,,,\n177,PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding,Kaichun Mo (Stanford)*; Shilin Zhu (UCSD); Angel X Chang (Eloquent Labs); Li Yi (Stanford); Subarna Tripathi (Intel AI Lab); Leonidas Guibas (Stanford University); Hao Su (UCSD),Datasets and Evaluation,\"Big Data, Large Scale Methods ; Deep Learning ; Recognition: Detection, Categorization, Retrieval; S\",Poster,1.1,69,D,,,\n503,A Dataset and Benchmark for Large-scale Multi-modal Face Anti-Spoofing,\"Shifeng Zhang (CBSR, NLPR, CASIA)*; Xiaobo Wang (JD AI Research); Ajian Liu (MUST); Chenxu Zhao (JD AI Research); Jun Wan (NLPR, CASIA); Sergio Escalera (Computer Vision Center (UAB) & University of Barcelona,); Hailin Shi (JD AI Research); Zezheng Wang (Jingdong Finance); Stan Li (National Lab. 
of Pattern Recognition, China)\",Datasets and Evaluation,Biometrics,Poster,1.1,70,D,,,\n1047,Unsupervised Learning of Consensus Maximization for 3D Vision Problems,Thomas Probst (ETH Zurich)*; Danda Pani Paudel (ETH Zürich); Ajad Chhatkuli (ETH Zurich); Luc Van Gool (ETH Zurich),Computer Vision Theory,Deep Learning,Poster,1.1,71,D,,,\n1257,Detecting Private Information and Its Purpose in Pictures Taken by Blind People,\"Danna Gurari (University of Texas at Austin); Qing Li (University of California, Los Angeles); Chi Lin (University of Texas at Austin); Yinan Zhao (University of Texas at Austin)*; Anhong Guo (Carnegie Mellon University); Abigale Stangl (University of Colorado Boulder); Jeffrey Bigham (Carnegie Mellon University)\",Datasets and Evaluation,Vision + Language ; Vision Applications and Systems,Poster,1.1,72,D,,,\n494,SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration without Correspondences,Huu Minh Le (Queensland University of Technology)*; Thanh-Toan Do (The University of Liverpool); Tuan NA Hoang (Singapore University of Technology and Design); Ngai-Man Cheung (Singapore University of Technology and Design),3D from Multiview and Sensors,,Oral,1.1,73,E,,,\n2315,BAD SLAM: Bundle Adjusted Direct RGB-D SLAM,Thomas Schöps (ETH Zurich)*; Torsten Sattler (Chalmers University of Technology); Marc Pollefeys (ETH Zurich / Microsoft),3D from Multiview and Sensors,Datasets and Evaluation; RGBD sensors and analytics,Oral,1.1,74,E,,,\n2286,Revealing Scenes by Inverting Structure from Motion Reconstructions,Francesco Pittaluga (University of Florida)*; Sanjeev J Koppal (University of Florida); Sing Bing Kang (Microsoft Research); Sudipta Sinha (Microsoft Research),3D from Multiview and Sensors,Deep Learning ; Image and Video Synthesis,Oral,1.1,75,E,,,\n1185,Strand-accurate Multi-view Hair Capture,Giljoo Nam (KAIST)*; Chenglei Wu (Facebook Reality Labs); Min H. 
Kim (KAIST); Yaser Sheikh (Facebook Reality Labs),3D from Multiview and Sensors,Vision + Graphics,Oral,1.1,76,E,,,\n6756,DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation,Jeong Joon Park (University of Washington)*; Peter R Florence (MIT); Julian Straub (Facebook Reality Labs); Richard Newcombe (Facebook); Steven Lovegrove (Facebook),Deep Learning,3D from Multiview and Sensors; Representation Learning; Vision + Graphics,Oral,1.1,77,E,,,\n2957,Pushing the Boundaries of View Extrapolation with Multiplane Images,Pratul Srinivasan (UC Berkeley)*; Richard Tucker (Google); Jonathan T Barron (Google Research); Ravi Ramamoorthi (University of California San Diego); Ren Ng (UC Berkeley); Noah Snavely (Cornell University and Google AI),3D from Multiview and Sensors,Computational Photography; Image and Video Synthesis; Vision + Graphics,Oral,1.1,78,E,,,\n1935,GA-Net: Guided Aggregation Net for End-to-end Stereo Matching,Feihu Zhang (University of Oxford)*; Victor Adrian Prisacariu (University of Oxford); Yang Ruigang (Baidu); Philip Torr (University of Oxford),3D from Multiview and Sensors,Deep Learning,Oral,1.1,79,E,,,\n2901,Real-time self-adaptive deep stereo,Alessio Tonioni (University of Bologna); Fabio Tosi (University of Bologna); Matteo Poggi (University of Bologna)*; Stefano Mattoccia (University of Bologna); Luigi Di Stefano (University of Bologna),3D from Multiview and Sensors,Deep Learning ; Low-level Vision,Oral,1.1,80,E,,,\n6639,LAF-Net: Locally Adaptive Fusion Networks for Stereo Confidence Estimation,Sunok Kim (Yonsei University); Seungryong Kim (Yonsei University); Dongbo Min (Ewha Womans University); Kwanghoon Sohn (Yonsei Univ.)*,3D from Multiview and Sensors,Others,Oral,1.1,81,E,,,\n3522,NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences,Chen Zhao (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. 
of Sci.&Tech.); chi li (Huazhong University of Science and Technology); Xin Li (West Virginia University); Jiaqi Yang (Huazhong Univ. of Sci.&Tech.)*,3D from Multiview and Sensors,Representation Learning,Oral,1.1,82,E,,,\n5852,Coordinate-Free Carlsson-Weinshall Duality and Relative Multi-View Geometry,Matthew Trager (NYU)*; Martial Hebert (Carnegie Mellon University); Jean Ponce (Inria),3D from Multiview and Sensors,Computer Vision Theory,Oral,1.1,83,E,,,\n1944,Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image,\"Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen))*; Zhaoxuan Zhang (Dalian University of Technology, Shenzhen Research Institute of Big Data); Dong Du (University of Science and Technology of China, Shenzhen Research Institute of Big Data); Mingdai Yang (Chinese University of Hong Kong, Shenzhen); Jingming Yu (Alibaba); Pan Pan (Alibaba Group); Xin Yang (Dalian University of Technology); Ligang Liu (University of Science and Technology of China); Zixiang Xiong (Texas A&M University); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen )\",3D from Multiview and Sensors,Deep Learning,Oral,1.1,84,E,,,\n476,Structural Relational Reasoning of Point Clouds,Yueqi Duan (Tsinghua University); Jiwen Lu (Tsinghua University)*; Yu Zheng (Tsinghua University); Jie Zhou (Tsinghua University); Qi Tian (Huawei Noah’s Ark Lab),3D from Multiview and Sensors,Visual Reasoning,Poster,1.1,85,E,,,\n666,Morphable Model-based Multi-view 3D Face Reconstruction with Convolutional Neural Networks,\"Fanzi Wu ( The Chinese University of Hong Kong); Linchao Bao (Tencent AI Lab)*; Yonggen Ling (Tencent AI Lab); Yibing Song (Tencent AI Lab); Yajing Chen (Shanghai Jiao Tong University); Songnan Li (The Chinese University of Hong Kong); King N. 
Ngan (CUHK, UESTC); Wei Liu (Tencent)\",3D from Multiview and Sensors,\"Face, Gesture, and Body Pose\",Poster,1.1,86,E,,,\n787,Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction,\"Chen-Hsuan Lin (Carnegie Mellon University)*; Bryan Russell (Adobe Research); Oliver Wang (Adobe Systems Inc); Vladimir Kim (Adobe); Simon Lucey (CMU); Matthew Fisher (Adobe Research); Eli Shechtman (Adobe Research, US)\",3D from Multiview and Sensors,,Poster,1.1,87,E,,,\n1011,Guided Stereo Matching,Matteo Poggi (University of Bologna)*; Davide Pallotti (University of Bologna); Fabio Tosi (University of Bologna); Stefano Mattoccia (University of Bologna),3D from Multiview and Sensors,Deep Learning ; Low-level Vision,Poster,1.1,88,E,,,\n1191,\"Unsupervised Event-based Learning of Optical Flow, Depth, and Egomotion\",Alex Z Zhu (University of Pennsylvania)*; Liangzhe Yuan (University of Pennsylvania); Kenneth Chaney (University of Pennsylvania); Kostas Daniilidis (University of Pennsylvania),3D from Multiview and Sensors,Deep Learning ; Motion and Tracking,Poster,1.1,89,E,,,\n1221,Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN,\"Shiyi Lan (University of Maryland)*; Ruichi Yu (University of Maryland, College Park); Gang Yu (Face++); Larry Davis (University of Maryland)\",3D from Multiview and Sensors,\"Deep Learning ; RGBD sensors and analytics; Segmentation, Grouping and Shape\",Poster,1.1,90,E,,,\n155,3D Point-Capsule Networks,\"Tolga Birdal (TU Munich)*; YongHeng Zhao (University of Padova); Haowen Deng (Technical University of Munich); Federico Tombari (Technical University of Munich, Germany)\",RGBD sensors and analytics,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Representation Learning; Robotics\",Poster,1.1,91,F,,,\n226,GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving,\"Buyu LI (The Chinese University of Hong Kong)*; Wanli Ouyang (The University of Sydney); Lu Sheng (The Chinese University of 
Hong Kong); Xingyu ZENG (SenseTime Group Limited); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",3D from Single Image,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,92,F,,,\n348,Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding,Zehao Yu (ShanghaiTech University); Jia Zheng (ShanghaiTech University); Dongze Lian (Shanghaitech University); Zihan Zhou (Penn State University); Shenghua Gao (Shanghaitech University)*,3D from Single Image,\"Segmentation, Grouping and Shape\",Poster,1.1,93,F,,,\n593,3DN: 3D Deformation Network,Weiyue Wang (USC)*; Duygu Ceylan (Adobe Research); Radomir Mech (Adobe Systems Incorporated); Ulrich Neumann (USC),3D from Single Image,Vision + Graphics,Poster,1.1,94,F,,,\n1073,HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation,Cheng Sun (National Tsing Hua University)*; Chi-Wei Hsiao (National Tsing Hua University); Min Sun (NTHU); Hwann-Tzong Chen (National Tsing Hua University),3D from Single Image,Deep Learning,Poster,1.1,95,F,,,\n1077,Deep Relational Reasoning Network for Monocular 3D Object Detection,Lijie Liu (Tsinghua University); Jiwen Lu (Tsinghua University)*; Chunjing Xu (Huawei Noah's Ark Lab); Qi Tian (Huawei Noah’s Ark Lab); Jie Zhou (Tsinghua University),3D from Single Image,Deep Learning ; Robotics + Driving; Scene Analysis and Understanding; Visual Reasoning,Poster,1.1,96,F,,,\n55,Pushing the Envelope for RGB-based Dense 3D Hand Pose Estimation via Neural Rendering,Seungryul Baek (Imperial College London)*; Kwang In Kim (UNIST); Tae-Kyun Kim (Imperial College London),\"Face, Gesture, and Body Pose\",3D from Single Image; Deep Learning ; Motion and Tracking; Vision Applications and Systems,Poster,1.1,97,G,,,\n117,Self-Supervised Learning of 3D Human Pose using Multi-view Geometry,Muhammed Kocabas (Middle East Technical University)*; Salih Karagoz (Middle East Technical University); Emre Akbas (Middle East Technical University),\"Face, 
Gesture, and Body Pose\",3D from Single Image,Poster,1.1,98,G,,,\n191,FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation from a Single Image,\"Tsun-Yi Yang (Scape Technologies; Academia Sinica, Taiwan; National Taiwan University, Taiwan)*; Yi-Ting Chen (Academia Sinica); Yen-Yu Lin (Academia Sinica); Yung-Yu Chuang (National Taiwan University)\",\"Face, Gesture, and Body Pose\",\"Computer Vision Theory; Deep Learning ; Recognition: Detection, Categorization, Retrieval; RGBD sens\",Poster,1.1,99,G,,,\n332,Dense 3D Face Decoding over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders,Yuxiang Zhou (Imperial College London); Jiankang Deng (Imperial College London)*; Irene Kotsia (Hellenic Open University); Stefanos Zafeiriou (Imperial College London),\"Face, Gesture, and Body Pose\",3D from Single Image; Deep Learning ; Representation Learning; RGBD sensors and analytics; Segmentat,Poster,1.1,100,G,,,\n370,Does Learning Specific Features for Related Parts Help Human Pose Estimation?,Wei Tang (Northwestern University)*; Ying Wu (Northwestern University),\"Face, Gesture, and Body Pose\",,Poster,1.1,101,G,,,\n410,Linkage Based Face Clustering via Graph Convolution Network,Zhongdao Wang (Tsinghua University)*; Liang Zheng (Australian National University); Ya-Li Li (THU); Shengjin Wang (Tsinghua University),\"Face, Gesture, and Body Pose\",Deep Learning,Poster,1.1,102,G,,,\n514,Towards High-fidelity Nonlinear 3D Face Morphable Model,Luan Tran (Michigan State University)*; Feng Liu (Michigan State University); Xiaoming Liu (Michigan State University),\"Face, Gesture, and Body Pose\",3D from Single Image,Poster,1.1,103,G,,,\n552,Deep Face Recognition via Exclusive Regularization,Kai Zhao (Nankai University)*; Jingyi Xu (Nankai University); Ming-Ming Cheng (Nankai University),\"Face, Gesture, and Body Pose\",,Poster,1.1,104,G,,,\n808,BridgeNet: A Continuity-Aware Probabilistic Network for Age Estimation,Wanhua Li (Tsinghua University); 
Jiwen Lu (Tsinghua University)*; Jianjiang Feng (Tsinghua University); Chunjing Xu (Huawei Noah's Ark Lab); Jie Zhou (Tsinghua University); Qi Tian (Huawei Noah’s Ark Lab),\"Face, Gesture, and Body Pose\",Representation Learning,Poster,1.1,105,G,,,\n821,GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction,Baris Gecer (Imperial College London)*; Stylianos Ploumpis (Imperial College London); Irene Kotsia (Hellenic Open University); Stefanos Zafeiriou (Imperial College London),\"Face, Gesture, and Body Pose\",3D from Single Image; Deep Learning ; Image and Video Synthesis; Vision + Graphics,Poster,1.1,106,G,,,\n984,Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition with Multimodal Training,Mahdi Abavisani (Rutgers University)*; HAMID VAEZI JOZE (Microsoft); Vishal Patel (Johns Hopkins University),\"Face, Gesture, and Body Pose\",Action Recognition ; RGBD sensors and analytics,Poster,1.1,107,G,,,\n1043,Learning to Reconstruct People in Clothing from a Single RGB Camera,\"Thiemo Alldieck (TU Braunschweig)*; Marcus Magnor (TU Braunschweig); Bharat Lal Bhatnagar (MPI-INF); Christian Theobalt (MPI Informatik); Gerard Pons-Moll (MPII, Germany)\",\"Face, Gesture, and Body Pose\",,Poster,1.1,108,G,,,\n1157,Distilled Person Re-identification: Towards a More Scalable System,\"Ancong Wu (Sun Yat-sen University); WEI-SHI ZHENG (Sun Yat-sen University, China)*; Xiaowei Guo (Tencent Youtu Lab); Jian-Huang Lai (Sun Yat-sen University)\",\"Face, Gesture, and Body Pose\",\"Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,1.1,109,G,,,\n292,Video Action Transformer Network,Rohit Girdhar (Carnegie Mellon University)*; Joao Carreira (DeepMind); Carl Doersch (DeepMind); Andrew Zisserman (University of Oxford),Action Recognition,Deep Learning,Oral,1.1,110,H,,,\n302,Timeception for Complex Action Recognition,Noureldien Hussein (University of Amsterdam)*; Stratis Gavves (University of 
Amsterdam); Arnold W.M. Smeulders (University of Amsterdam),Action Recognition,Video Analytics,Oral,1.1,111,H,,,\n1670,STEP: Spatio-Temporal Progressive Learning for Video Action Detection,Xitong Yang (University of Maryland)*; Xiaodong Yang (NVIDIA Research); Ming-Yu Liu (NVIDIA); Fanyi Xiao (University of California Davis); Larry Davis (University of Maryland); Jan Kautz (NVIDIA),Video Analytics,,Oral,1.1,112,H,,,\n1745,Relational Action Forecasting,Chen Sun (Google)*; Abhinav Shrivastava (University of Maryland); Carl Vondrick (Columbia University); Rahul Sukthankar (Google); Kevin Murphy (Google); Cordelia Schmid (Google),Action Recognition,Video Analytics,Oral,1.1,113,H,,,\n2310,Long-Term Feature Banks for Detailed Video Understanding,Chao-Yuan Wu (UT Austin)*; Christoph Feichtenhofer (Facebook AI Research); Haoqi Fan (Facebook AI Research); Kaiming He (Facebook AI Research); Philipp Kraehenbuehl (UT Austin); Ross Girshick (FAIR),Action Recognition,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding;\",Oral,1.1,114,H,,,\n229,Which Way Are You Going? Imitative Decision Learning for Path Forecasting in Dynamic Scenes,Yuke Li (York University)*,Motion and Tracking,Video Analytics,Oral,1.1,115,H,,,\n3106,What and How Well You Performed? 
A Multitask Approach To Action Quality Assessment,Paritosh Parmar (UNLV)*; Brendan Morris (UNLV),\"Face, Gesture, and Body Pose\",Action Recognition ; Datasets and Evaluation; Video Analytics,Oral,1.1,116,H,,,\n1382,MHP-VOS: Video Object Segmentation with Multiple Hypotheses Propagation,Shuangjie Xu (Huazhong University of Science and Technology); Daizong Liu (Huazhong University of Science and Technology); Linchao Bao (Tencent AI Lab)*; Wei Liu (Tencent); Pan Zhou ( Huazhong University of Science and Technology),Video Analytics,\"Motion and Tracking; Segmentation, Grouping and Shape\",Oral,1.1,117,H,,,\n1517,2.5D Visual Sound,Ruohan Gao (University of Texas at Austin)*; Kristen Grauman (Facebook AI Research & UT Austin),Video Analytics,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Oral,1.1,118,H,,,\n1999,Language-driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model,\"Weining Wang (Institute of Automation, Chinese Academy of Sciences)*; Yan Huang (Institute of Automation, Chinese Academy of Sciences); Liang Wang (NLPR, China)\",Video Analytics,Vision + Language,Oral,1.1,119,H,,,\n5591,Gaussian Temporal Awareness Networks for Action Localization,Fuchen Long (University of Science and Technology of China); Ting Yao (JD AI Research)*; Zhaofan Qiu (University of Science and Technology of China); Xinmei Tian (USTC); Jiebo Luo (U. Rochester); Tao Mei (AI Research of JD.com),Video Analytics,Action Recognition,Oral,1.1,120,H,,,\n6940,Efficient Video Classification Using Fewer Frames,\"Shweta Bhardwaj (Indian Institute of Technology Madras, Chennai)*; Mukundhan Srinivasan (NVIDIA); Mitesh M. 
Khapra (Indian Institute of Technology Madras)\",Video Analytics,Deep Learning ; Vision Applications and Systems,Oral,1.1,121,H,,,\n239,A Perceptual Prediction Framework for Self Supervised Event Segmentation,\"Sathyanarayanan N Aakur (University of South Florida)*; Sudeep Sarkar (University of South Florida, Tampa)\",Video Analytics,Deep Learning ; Representation Learning,Poster,1.1,122,H,,,\n260,COIN: A Large-scale Dataset for Comprehensive Instruction Video Analysis,\"Yansong Tang (Tsinghua University); Dajun Ding (Meitu, Inc.); Yongming Rao (Tsinghua University); Yu Zheng (Tsinghua University); Danyang Zhang (Tsinghua University); Lili Zhao (Meitu); Jiwen Lu (Tsinghua University)*; Jie Zhou (Tsinghua University)\",Action Recognition,Datasets and Evaluation; Video Analytics,Poster,1.1,123,H,,,\n484,Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization,Chenchen Liu (Peking University); Xinyu Weng (Peking University); Yadong Mu (Peking University)*,Video Analytics,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,124,H,,,\n662,An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition,\"Chenyang Si (Institute of Automation, Chinese Academy of Sciences)*; Wentao Chen (CASIA); Wei Wang (Institute of Automation Chinese Academy of Sciences); Liang Wang (NLPR, China); Tieniu Tan (NLPR, China)\",Action Recognition,\"Face, Gesture, and Body Pose ; Motion and Tracking; Video Analytics\",Poster,1.1,125,H,,,\n727,Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection,\"Jia-Xing Zhong (School of Electronic and Computer Engineering, Peking University); Nannan Li (Peking University Shenzhen Graduate School); Weijie Kong (School Electronic and Computer Engineering, Peking University); Shan Liu (Tencent America); Thomas H Li (Advanced Institute of Information Technology, Peking University); Ge Li (SECE, Shenzhen Graduate School, Peking University)*\",Video 
Analytics,Action Recognition ; Vision Applications and Systems,Poster,1.1,126,H,,,\n747,MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment,\"Da Zhang (UC Santa Barbara)*; Xiyang Dai (UMD); Xin Wang (University of California, Santa Barbara); Yuan-Fang Wang (UC Santa Barbara); Larry Davis (University of Maryland)\",Video Analytics,Vision + Language,Poster,1.1,127,H,,,\n762,Less is More: Learning Highlight Detection from Video Duration,Bo Xiong (University of Texas at Austin)*; Yannis Kalantidis (Facebook Research); Deepti Ghadiyaram (Facebook); Kristen Grauman (Facebook AI Research & UT Austin),Video Analytics,\"Action Recognition ; Recognition: Detection, Categorization, Retrieval\",Poster,1.1,128,H,,,\n1054,DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition,Zheng Shou (Columbia University); Zhicheng Yan (Facebook AI)*; Yannis Kalantidis (Facebook Research); Laura Sevilla-Lara (Facebook); Marcus Rohrbach (Facebook AI Research); Xudong Lin (Columbia University); Shih-Fu Chang (Columbia University),Action Recognition,,Poster,1.1,129,H,,,\n1099,AdaFrame: Adaptive Frame Selection for Fast Video Recognition,Zuxuan Wu (UMD)*; Caiming Xiong (Salesforce Research); Chih-Yao Ma (Georgia Institute of Technology); Richard Socher (Salesforce); Larry Davis (University of Maryland),Video Analytics,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,130,H,,,\n1134,Spatio-temporal Video Re-localization by Warp LSTM,Yang Feng (University of Rochester)*; Lin Ma (Tencent AI Lab); Wei Liu (Tencent); Jiebo Luo (U. 
Rochester),Video Analytics,,Poster,1.1,131,H,,,\n1273,Completeness Modeling and Context Separation for Weakly Supervised Temporal Action Localization,Daochang Liu (Peking University); Tingting Jiang (Peking University)*; Yizhou Wang (PKU),Action Recognition,Video Analytics,Poster,1.1,132,H,,,\n629,Unsupervised Deep Tracking,Ning Wang (University of Science and Technology of China)*; Yibing Song (Tencent AI Lab); Chao Ma (Shanghai Jiao Tong University); Wengang Zhou (University of Science and Technology of China); Wei Liu (Tencent); Houqiang Li (University of Science and Technology of China),Motion and Tracking,Representation Learning,Poster,1.1,133,I,,,\n648,Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers,Zhen He (NUDT)*; Jian Li (NUDT); Daxue Liu (NUDT); Hangen He (NUDT); David Barber (UCL),Motion and Tracking,Deep Learning ; Representation Learning; Vision + Graphics ; Visual Reasoning,Poster,1.1,134,I,,,\n699,Fast Online Object Tracking and Segmentation: A Unifying Approach,Qiang Wang (Oxford); Li Zhang (University of Oxford)*; Luca Bertinetto (University of Oxford); Weiming Hu (Institute of Automation，Chinese Academy of Sciences); Philip Torr (University of Oxford),Motion and Tracking,Deep Learning ; Vision Applications and Systems,Poster,1.1,135,I,,,\n743,Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters,Ugur Kart (Tampere University of Technology)*; Alan Lukezic (University of Ljubljana); Matej Kristan (University of Ljubljana); Joni-Kristian Kamarainen (Tampere University); Jiri Matas (CMP CTU FEE),Motion and Tracking,RGBD sensors and analytics,Poster,1.1,136,I,,,\n992,SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints,Amir A Sadeghian (Stanford)*; Vineet S Kosaraju (Stanford Vision & Learning Lab); Ali Sadeghian (University of Florida); Noriaki Hirose (Stanford University); Hamid Rezatofighi (Stanford University); Silvio Savarese (Stanford 
University),Motion and Tracking,Vision Applications and Systems,Poster,1.1,137,I,,,\n1103,Leveraging Shape Completion for 3D Siamese Tracking,Silvio Giancola (KAUST)*; Jesus Zarzar (KAUST); Bernard Ghanem (KAUST),Motion and Tracking,Robotics + Driving,Poster,1.1,138,I,,,\n1225,Target-Aware Deep Tracking,\"Xin Li (Harbin Institute of Technology, Shenzhen); Chao Ma (Shanghai Jiao Tong University); Baoyuan Wu (Tencent AI Lab); Zhenyu He (Harbin Institute of Technology (Shenzhen))*; Ming-Hsuan Yang (University of California at Merced)\",Motion and Tracking,,Poster,1.1,139,I,,,\n1258,Spatiotemporal CNN for Video Object Segmentation,Kai Xu (University of Chinese Academy of Sciences); Longyin Wen (JD Digits)*; guorong Li (CAS); Liefeng Bo (JD Finance); Qingming Huang (University of Chinese Academy of Sciences),Motion and Tracking,\"Segmentation, Grouping and Shape; Video Analytics\",Poster,1.1,140,I,,,\n1335,Towards Rich Feature Discovery with Class Activation Maps Augmentation for Person Re-Identification,\"Wenjie Yang (Institute of Automation, Chinese Academy of Sciences)*; Houjing Huang (CASIA); Zhang Zhang (Institute of Automation, Chinese Academy of Sciences); Xiaotang Chen (Institute of Automation, Chinese Academy of Sciences); Kaiqi Huang (Institute of Automation, Chinese Academy of Sciences); Shu Zhang (Deepwise AI Lab)\",Biometrics,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,1.1,141,I,,,\n577,Semantic Regeneration Network,Yi Wang (Chinese University of Hong Kong)*; Xin Tao (Tencent); Xiaoyong Shen (Tencent); Jiaya Jia (Chinese University of Hong Kong),Image and Video Synthesis,Deep Learning,Poster,1.1,142,J,,,\n673,End-to-End Time-lapse Video Synthesis from a Single Outdoor Image,Seonghyeon Nam (Yonsei University); Chongyang Ma (Kwai Inc.); Menglei Chai (Snap Inc.); William Brendel (Snap Inc.); Ning Xu (Snap); Seon Joo Kim (Yonsei Univ.)*,Image and Video Synthesis,Deep Learning ; Video 
Analytics,Poster,1.1,143,J,,,\n710,GIF2Video: Color Dequantization and Temporal Interpolation of GIF images,Yang Wang (Stony Brook University)*; Haibin Huang (Face++ (Megvii)); Chuan Wang (Face++ (Megvii)); Tong He (UCLA); jue wang (Face++ (Megvii)); Minh Hoai Nguyen (Stony Brook University),Image and Video Synthesis,Vision Applications and Systems,Poster,1.1,144,J,,,\n869,Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis,\"Qi Mao (Peking University)*; Hsin-Ying Lee (University of California, Merced); Hung-Yu Tseng (University of California, Merced); Siwei Ma (Peking University, China); Ming-Hsuan Yang (University of California at Merced)\",Image and Video Synthesis,Deep Learning,Poster,1.1,145,J,,,\n963,Pluralistic Image Completion,Chuanxia Zheng (Nanyang Technological University)*; Tat-Jen Cham (Nanyang Technological University); Jianfei Cai (Nanyang Technological University),Image and Video Synthesis,Deep Learning,Poster,1.1,146,J,,,\n1005,PAGE-Net: Salient Object Detection with Pyramid Attention and Salient Edge,Wenguan Wang (Inception Institute of Artificial Intelligence); Shuyang Zhao (Beijing Institute of Technology ); Jianbing Shen (Beijing Institute of Technology)*; Steven Hoi (SMU); Ali Borji (University of Central Florida),Image and Video Synthesis,Low-level Vision,Poster,1.1,147,J,,,\n1038,Latent Filter Scaling for Multimodal Unsupervised Image-to-Image Translation,Yazeed Alharbi (KAUST)*; Neil Smith (King Abdullah University of Science and Technology (KAUST)); Peter Wonka (KAUST),Image and Video Synthesis,Deep Learning,Poster,1.1,148,J,,,\n1040,Attention-aware Multi-stroke Style Transfer,Yuan Yao (Tsinghua University)*; Jianqiang Ren (Alibaba); Xuansong Xie (Alibaba); Weidong Liu (Tsinghua University); Yong-Jin Liu (Tsinghua University); Jun Wang (UCL),Image and Video Synthesis,Deep Learning ; Vision Applications and Systems,Poster,1.1,149,J,,,\n1111,Feedback Adversarial Learning: Spatial Feedback for Improving Generative 
Adversarial Networks,Jacob Huh (Carnegie Mellon University)*; Shao-Hua Sun (University of Southern California); Ning Zhang (UC Berkeley),Image and Video Synthesis,Computational Photography; Deep Learning,Poster,1.1,150,J,,,\n1127,Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting,Yanhong Zeng (Sun Yat-sen University)*; Jianlong Fu (Microsoft Research); Hongyang Chao (Sun Yat-sen University); Baining Guo (MSR Asia),Image and Video Synthesis,Computational Photography,Poster,1.1,151,J,,,\n1135,Example-Guided Image Synthesis using Adversarial Networks with Genre Consistency,Miao Wang (Beihang University)*; Guo-Ye Yang (Tsinghua University); Ruilong Li (Tsinghua University); Run-Ze Liang (Tsinghua University); Song-Hai Zhang (Tsinghua University); Peter Hall (University of Bath); Shimin Hu (Tsinghua University),Image and Video Synthesis,Deep Learning,Poster,1.1,152,J,,,\n1166,MirrorGAN: Learning Text-to-image Generation by Redescription,Tingting Qiao (Zhejiang University)*; Jing Zhang (University of Technology Sydney); Dacheng Tao (University of Sydney); Duanqing Xu (Zhejiang University),Image and Video Synthesis,Deep Learning ; Vision + Language,Poster,1.1,153,J,,,\n270,Light Field Messaging with Deep Photographic Steganography,Eric Wengrowski (Rutgers University)*; Kristin Dana (Rutgers University),Computational Photography,Physics-based Vision and Shape-from-X,Poster,1.1,154,K,,,\n719,Im2Pencil: Controllable Pencil Illustration from Photographs,\"Yijun Li (University of California, Merced)*; Chen Fang (ByteDance); Aaron Hertzmann (Adobe Research); Eli Shechtman (Adobe Research, US); Ming-Hsuan Yang (University of California at Merced)\",Vision + Graphics,Image and Video Synthesis,Poster,1.1,155,K,,,\n797,When Color Constancy Goes Wrong: Correcting Improperly White-Balanced Images,Mahmoud Afifi (York University)*; Brian Price (Adobe); Scott Cohen (Adobe Research); Michael S Brown (York University),Computational Photography,Low-level Vision; 
Vision Applications and Systems,Poster,1.1,156,K,,,\n856,Beyond Volumetric Albedo --- A Surface Optimization Framework for Non-Line-of-Sight Imaging,Chia-Yin Tsai (Carnegie Mellon University)*; Aswin Sankaranarayanan (Carnegie Mellon University); Ioannis Gkioulekas (Carnegie Mellon University),Computational Photography,Physics-based Vision and Shape-from-X,Poster,1.1,157,K,,,\n859,Reflection Removal Using A Dual-Pixel Sensor,Abhijith Punnappurath (York University)*; Michael S Brown (York University),Computational Photography,Low-level Vision,Poster,1.1,158,K,,,\n1060,Practical Coding Function Design for Time of Flight Imaging,\"Felipe Gutierrez-Barragan (University of Wisconsin-Madison)*; Syed Azer Reza (University of Wisconsin-Madison); Andreas Velten (University of Wisconsin - Madison); Mohit Gupta (\"\"University of Wisconsin-Madison, USA \"\")\",Computational Photography,Physics-based Vision and Shape-from-X,Poster,1.1,159,K,,,\n1227,Zoom in with Meta-SR: A Magnification-Arbitrary Network for Super-Resolution,\"Xuecai Hu (USTC)*; Haoyuan Mu (Tsinghua University); Xiangyu Zhang (Megvii Inc); Zilei Wang (University of Science and Technology of China); Jian Sun (Megvii Technology); Tieniu Tan (NLPR, China)\",Computational Photography,Low-level Vision,Poster,1.1,160,K,,,\n60,Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net,\"Qi Xie (Xi'an Jiaotong University)*; Minghao Zhou (Xi'an Jiaotong University); Deyu Meng (Xi'an Jiaotong University); Qian Zhao (Xi'an Jiaotong University ); Wangmeng Zuo (Harbin Institute of Technology, China); Zongben Xu (Xi'an Jiaotong University)\",Low-level Vision,Computational Photography; Deep Learning,Poster,1.1,161,L,,,\n145,Learning Attraction Field Representation for Robust Line Segment Detection,Nan Xue (Wuhan University); Song Bai (University of Oxford); Fudong Wang (Wuhan University); Gui-Song Xia (Wuhan University)*; Tianfu Wu (NC State University); Liangpei Zhang ( Wuhan University),Low-level Vision,\"Deep 
Learning ; Segmentation, Grouping and Shape\",Poster,1.1,162,L,,,\n219,Blind Super-Resolution With Iterative Kernel Correction,\"Jinjin Gu (The Chinese University of Hong Kong, Shenzhen)*; Hannan Lu (Harbin Institute of Technology); Wangmeng Zuo (Harbin Institute of Technology, China); Chao Dong (SIAT)\",Low-level Vision,,Poster,1.1,163,L,,,\n435,Video Magnification in the Wild: Using Fractional Anisotropy in Temporal Distribution,\"Shoichiro Takeda (NTT Corporation)*; Yasunori Akagi (NTT Service Evolution Laboratories, NTT Corporation); Kazuki Okami (NTT Corporation); Megumi Isogai (NTT Media Intelligence Laboratories); Hideaki Kimata (NTT)\",Low-level Vision,Image and Video Synthesis,Poster,1.1,164,L,,,\n438,Attentive Feedback Network for Boundary-Aware Salient Object Detection,Mengyang Feng (Dalian University of Technology); Huchuan Lu (Dalian University of Technology)*; Errui Ding (Baidu Inc.),Low-level Vision,\"Segmentation, Grouping and Shape\",Poster,1.1,165,L,,,\n444,Heavy Rain Image Restoration: Integrating Physics Model and Conditional Adversarial Learning,\"Ruoteng Li (National University of Singapore)*; Loong Fah Cheong (NUS); Robby Tan (\"\"Yale-NUS College, Singapore\"\")\",Low-level Vision,Image and Video Synthesis,Poster,1.1,166,L,,,\n515,Learning to Calibrate Straight Lines for Fisheye Image Rectification,Zhucun Xue (WuHan university); Nan Xue (Wuhan University); Gui-Song Xia (Wuhan University)*; Weiming Shen (Wuhan University),Low-level Vision,Computational Photography,Poster,1.1,167,L,,,\n584,Camera Lens Super-Resolution,Chang Chen (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China)*; Xinmei Tian (USTC); Zheng-Jun Zha (University of Science and Technology of China); Feng Wu (University of Science and Technology of China),Low-level Vision,,Poster,1.1,168,L,,,\n586,Frame-Consistent Recurrent Video Deraining with Dual-Level Flow,Wenhan Yang (City University of Hong Kong)*; Jiaying Liu (Peking 
University); Jiashi Feng (NUS),Low-level Vision,Computational Photography,Poster,1.1,169,L,,,\n675,Deep Plug-and-Play Super-Resolution for Arbitrary Blur Kernels,\"Kai Zhang (Harbin Institute of Technology); Wangmeng Zuo (Harbin Institute of Technology, China)*; Lei Zhang (\"\"Hong Kong Polytechnic University, Hong Kong, China\"\")\",Low-level Vision,,Poster,1.1,170,L,,,\n695,Sea-thru: A Method to Remove Water From Underwater Images,Derya Akkaynak (Dr.)*; Tali Treibitz (University of Haifa),Low-level Vision,Computational Photography,Poster,1.1,171,L,,,\n740,Deep Network Interpolation for Continuous Imagery Effect Transition,Xintao Wang (The Chinese University of Hong Kong)*; Ke Yu (The Chinese University of Hong Kong); Chao Dong (SIAT); Xiaoou Tang (The Chinese University of Hong Kong); Chen Change Loy (Nanyang Technological University),Low-level Vision,,Poster,1.1,172,L,,,\n759,Spatially Variant Linear Representation Models for Joint Filtering,Jinshan Pan (Nanjing University of Science and Technology)*; Jiangxin Dong (Dalian University of Technology); Jimmy Ren (SenseTime Research); Liang Lin (Sun Yat-sen University); Jinhui Tang (Nanjing University of Science and Technology); Ming-Hsuan Yang (University of California at Merced),Low-level Vision,Computational Photography,Poster,1.1,173,L,,,\n916,Toward Convolutional Blind Denoising of Real-world Noisy Photographs,\"Shi Guo (Harbin Institute of Technology); Zifei Yan (Harbin Institute of Technology); Kai Zhang (Harbin Institute of Technology); Wangmeng Zuo (Harbin Institute of Technology, China)*; Lei Zhang (\"\"Hong Kong Polytechnic University, Hong Kong, China\"\")\",Low-level Vision,,Poster,1.1,174,L,,,\n921,Towards Real Scene Super-Resolution with Raw Images,Xiangyu Xu (Tsinghua University)*; Yongrui Ma (sensetime); Wenxiu Sun (SenseTime Research),Low-level Vision,Deep Learning,Poster,1.1,175,L,,,\n1034,ODE-inspired Network Design for Single Image Super-Resolution,\"Xiangyu He (Institute of Automation, Chinese 
Academy of Sciences)*; Zitao Mo (Chinese Academy of Sciences); Peisong Wang (Institute of Automation, Chinese Academy of Sciences); Yang Liu (Alibaba Group); Mingyuan Yang (Alibaba); Jian Cheng (\"\"Chinese Academy of Sciences, China\"\")\",Low-level Vision,,Poster,1.1,176,L,,,\n1075,Blind Image Deblurring With Local Maximum Gradient Prior,Liang Chen (East China Normal University); Faming Fang (East China Normal University)*; Tingting Wang (East China Normal University); Guixu Zhang (East China Normal University),Low-level Vision,Optimization Methods,Poster,1.1,177,L,,,\n1318,Attention-guided Network for Ghost-free High Dynamic Range Imaging,\"Qingsen Yan (Northwestern Polytechnical University); Dong Gong (The University of Adelaide)*; Qinfeng Shi (University of Adelaide); Anton van den Hengel (University of Adelaide); Chunhua Shen (University of Adelaide); Ian Reid (\"\"University of Adelaide, Australia\"\"); Yanning Zhang (Northwestern Polytechnical University)\",Low-level Vision,,Poster,1.1,178,L,,,\n80,Searching for A Robust Neural Architecture in Four GPU Hours,Xuanyi Dong (UTS)*; Yi Yang (UTS),Representation Learning,Deep Learning,Poster,1.1,179,M,,,\n214,Hierarchy Denoising Recursive Autoencoders for 3D Scene Layout Prediction,Yifei Shi (National University of Defense Technology); Angel X Chang (Eloquent Labs); Manolis Savva (Simon Fraser University); Zhenlun Wu (Princeton University); Kai Xu (National University of Defense Technology)*,Scene Analysis and Understanding,\"Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape\",Poster,1.1,180,M,,,\n242,Adaptively-Connected Neural Network,\"Guangrun Wang (Sun Yat-sen University)*; Keze Wang (University of California, Los Angeles); Liang Lin (Sun Yat-sen University)\",Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,181,M,,,\n281,CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency,Yun-Chun Chen (Academia Sinica)*; Yen-Yu Lin 
(Academia Sinica); Ming-Hsuan Yang (University of California at Merced); Jia-Bin Huang (Virginia Tech),Representation Learning,\"Segmentation, Grouping and Shape\",Poster,1.1,182,M,,,\n282,Temporal Cycle-Consistency Learning,Debidatta Dwibedi (Google)*; Yusuf Aytar (DeepMind); Jonathan Tompson (Google); Pierre Sermanet (Google); Andrew Zisserman (University of Oxford),Representation Learning,\"Recognition: Detection, Categorization, Retrieval; Video Analytics; Visual Reasoning\",Poster,1.1,183,M,,,\n369,Predicting Future Frames using Retrospective Cycle GAN,Yong-Hoon Kwon (LG Electronics)*; Min-Gyu Park (Korea Electronics Technology Institute),Scene Analysis and Understanding,Image and Video Synthesis,Poster,1.1,184,M,,,\n390,Density Map Regression Guided Detection Network for RGB-D Crowd Counting and Localization,Dongze Lian (ShanghaiTech University)*; Jing Li (ShanghaiTech University); Jia Zheng (ShanghaiTech University); Weixin Luo (ShanghaiTech University); Shenghua Gao (ShanghaiTech University),Scene Analysis and Understanding,RGBD sensors and analytics,Poster,1.1,185,M,,,\n440,TAFE-Net: Task-Aware Feature Embeddings for Efficient Learning and Inference,Xin Wang (UC Berkeley)*; Fisher Yu (UC Berkeley); Ruth Wang (UC Berkeley); Trevor Darrell (UC Berkeley); Joseph Gonzalez (UC Berkeley),Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,186,M,,,\n642,Learning Semantic Segmentation from Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach,Yuhua Chen (ETH Zurich)*; Wen Li (ETH Zurich); Xiaoran Chen (ETH Zurich); Luc Van Gool (ETH Zurich),Scene Analysis and Understanding,\"Deep Learning ; Image and Video Synthesis; Recognition: Detection, Categorization, Retrieval\",Poster,1.1,187,M,,,\n651,Attentive Single-tasking of Multiple Tasks,Kevis-Kokitsi Maninis (Eidgenössische Technische Hochschule Zürich)*; Ilija Radosavovic (Facebook AI Research); Iasonas Kokkinos (UCL),Scene Analysis and Understanding,Deep 
Learning,Poster,1.1,188,M,,,\n718,FastAP: Deep Metric Learning to Rank,Kun He (Facebook Reality Labs)*; Fatih Cakir (Boston University); XIDE XIA (Boston University); Brian Kulis (Boston University); Stan Sclaroff (Boston University),Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,1.1,189,M,,,\n829,End-to-End Multi-Task Learning with Attention,Shikun Liu (Imperial College London)*; Edward Johns (Imperial College London); Andrew Davison (Imperial College London),Representation Learning,\"Deep Learning ; Segmentation, Grouping and Shape\",Poster,1.1,190,M,,,\n1097,Self-Supervised Learning via Conditional Motion Propagation,Xiaohang Zhan (The Chinese University of Hong Kong)*; Xingang Pan (The Chinese University of Hong Kong); Ziwei Liu (The Chinese University of Hong Kong); Dahua Lin (The Chinese University of Hong Kong); Chen Change Loy (Nanyang Technological University),Representation Learning,Scene Analysis and Understanding,Poster,1.1,191,M,,,\n1173,Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence,Hsueh-Ying Lai (National Chiao Tung University); Yi-Hsuan Tsai (NEC Labs America); Wei-Chen Chiu (National Chiao Tung University)*,Scene Analysis and Understanding,Deep Learning,Poster,1.1,192,M,,,\n1176,All about Structure: Adapting Structural Information across Domains for Boosting Semantic Segmentation,Wei-Lun Chang (National Chiao Tung University); Hui-Po Wang (National Chiao Tung University); Wen-Hsiao Peng (National Chiao Tung University); Wei-Chen Chiu (National Chiao Tung University)*,Scene Analysis and Understanding,Deep Learning,Poster,1.1,193,M,,,\n1199,Iterative Reorganization with Weak Spatial Constraints:Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning,Chen Wei (Peking University); Lingxi Xie (Johns Hopkins University)*; Xutong Ren (Peking University); Yingda Xia (Johns Hopkins University); Chi Su (Kingsoft Cloud); Jiaying Liu (Peking University); Qi 
Tian (Huawei Noah’s Ark Lab); Alan Yuille (Johns Hopkins University),Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,194,M,,,\n1306,Revisiting Self-Supervised Visual Representation Learning,Alexander Kolesnikov (Google Brain)*; Xiaohua Zhai (Google Brain); Lucas Beyer (Google Brain),Representation Learning,Deep Learning ; Low-level Vision,Poster,1.1,195,M,,,\n31,It’s not about the Journey; It’s about the Destination: Following Soft Paths under Question-Guidance for Visual Reasoning,Monica Haurilet (KIT)*; Alina Roitberg (KIT); Rainer Stiefelhagen (Karlsruhe Institute of Technology),Visual Reasoning,Vision + Language,Poster,1.1,196,N,,,\n340,Actively Seeking and Learning from Live Data,Damien Teney (The University of Adelaide)*; Anton van den Hengel (University of Adelaide),Vision + Language,Visual Reasoning,Poster,1.1,197,N,,,\n409,Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing,\"Xihui Liu (The Chinese University of Hong Kong)*; Zihao Wang (Sensetime); Hongsheng Li (Chinese University of Hong Kong); Jing Shao (Sensetime); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",Vision + Language,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,198,N,,,\n598,Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks,Peng Wang (The University of Adelaide)*; Qi Wu (University of Adelaide); Jiewei Cao (The University of Adelaide); Chunhua Shen (University of Adelaide); Lianli Gao (The University of Electronic Science and Technology of China); Anton van den Hengel (University of Adelaide),Vision + Language,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.1,199,N,,,\n622,Scene Graph Generation with External Knowledge and Image Reconstruction,Jiuxiang Gu (Nanyang Technological University)*; Handong Zhao (Adobe Research); Zhe Lin (Adobe Research); Sheng Li (University of Georgia); Jianfei Cai (Nanyang Technological 
University); Mingyang Ling (Google Cloud AI),Visual Reasoning,Vision + Language,Poster,1.1,200,N,,,\n674,Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval,Yale Song (Microsoft)*; Mohammad Soleymani (University of Southern California),Vision + Language,\"Recognition: Detection, Categorization, Retrieval; Video Analytics\",Poster,1.1,201,N,,,\n846,MUREL: Multimodal Relational Reasoning for Visual Question Answering,\"Hedi Ben-younes (Sorbonne université)*; Remi Cadene (LIP6); Matthieu Cord (Sorbonne University); Nicolas Thome (CNAM, Paris)\",Vision + Language,Scene Analysis and Understanding; Visual Reasoning,Poster,1.1,202,N,,,\n1022,Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering,Chenyou Fan (Indiana University)*; Xiaofan Zhang (JD.com); Shu Zhang (JD.com); Wensheng Wang (JD.com); Chi Zhang (JD.com); Heng Huang (University of Pittsburgh),Visual Reasoning,Deep Learning,Poster,1.1,203,N,,,\n1151,Information Maximizing Visual Question Generation,Ranjay Krishna (Stanford University)*; Michael Bernstein (Stanford University); Li Fei-Fei (Stanford University),Vision + Language,Scene Analysis and Understanding,Poster,1.1,204,N,,,\n1235,Learning to Detect Human-Object Interactions with Knowledge,\"Bingjie Xu (National University of Singapore)*; Wong Yongkang (National University of Singapore); Junnan Li (National University of Singapore); Qi Zhao (University of Minnesota); Mohan Kankanhalli (National University of Singapore,)\",Visual Reasoning,Scene Analysis and Understanding; Vision + Language,Poster,1.1,205,N,,,\n1298,Learning Words by Drawing Images,Adria Recasens (Massachusetts Institute of Technology)*; Dídac Surís (University of Toronto); David Bau (MIT); David Harwath (MIT CSAIL); James Glass (MIT); Antonio Torralba (MIT),Vision + Language,Visual Reasoning,Poster,1.1,206,N,,,\n5401,Factor Graph Attention,Idan Schwartz (Technion)*; Seunghak Yu (Samsung Research); Tamir Hazan (Technion); Alexander Schwing 
(UIUC),Vision + Language,Visual Reasoning,Poster,1.1,207,N,,,\n329,Reducing Uncertainty in Undersampled MRI Reconstruction with Active Acquisition,Zizhao Zhang (University of Florida)*; Adriana Romero (FAIR); Matthew Muckley (New York University); Pascal Vincent (Facebook FAIR & MILA Université de Montréal); Lin Yang (University of Florida); Michal Drozdzal (FAIR),\"Medical, Biological and Cell Microscopy\",Image and Video Synthesis,Poster,1.1,208,O,,,\n389,ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification,Fangneng Zhan (Nanyang Technological University); Shijian Lu (Nanyang Technological University)*,Document Analysis,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,1.1,209,O,,,\n545,ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape,Fabian Manhardt (TU Munich); Wadim Kehl (Toyota Research Institute)*; Adrien Gaidon (Toyota Research Institute),Robotics + Driving,3D from Single Image; Deep Learning,Poster,1.1,210,O,,,\n810,Collaborative Learning of Semi-Supervised Segmentation and Classification for Medical Images,Yi Zhou (Inception Institute of Artificial Intelligence)*; Xiaodong He (Inception Institute of Artificial Intelligence); Lei Huang (the inception institute of artificial intelligence); Li Liu (the inception institute of artificial intelligence); Fan Zhu (Inception Institute of Artificial Intelligence); Shanshan Cui (Inception Institute of Artificial Intelligence); Ling Shao (Inception Institute of Artificial Intelligence),\"Medical, Biological and Cell Microscopy\",\"Biometrics ; Deep Learning ; Representation Learning; Segmentation, Grouping and Shape\",Poster,1.1,211,O,,,\n949,Biologically-Constrained Graphs for Global Connectomics Reconstruction,Brian Matejek (Harvard University )*; Daniel Haehn (Harvard University); Haidong Zhu (Tsinghua University); Donglai Wei (Harvard/SEAS); Toufiq Parag (Harvard University); Hanspeter Pfister (Harvard University),\"Medical, Biological and Cell 
Microscopy\",\"Segmentation, Grouping and Shape\",Poster,1.1,212,O,,,\n974,P3SGD: Patient Privacy Preserving SGD for Regularizing Deep CNNs in Pathological Image Classification,\"Bingzhe Wu (Peking University)*; Shiwan Zhao (IBM Research); Guangyu Sun (Peking University); Xiaolu Zhang (Ant Financial Services Group); Zhong Su (IBM Research); Caihong Zeng (National Clinical Research Center of Kidney Diseases, Jinling Hospital, Nanjing University School of Medicine); Zhihong Liu (National Clinical Research Center of Kidney Diseases, Jinling Hospital, Nanjing University School of Medicine)\",\"Medical, Biological and Cell Microscopy\",Deep Learning ; Vision Applications and Systems,Poster,1.1,213,O,,,\n1070,Elastic Boundary Projection for 3D Medical Imaging Segmentation,Tianwei Ni (Peking University); Lingxi Xie (Johns Hopkins University)*; Huangjie Zheng (Shanghai Jiao Tong University); Elliot K Fishman (JHMI); Alan Yuille (Johns Hopkins University),\"Medical, Biological and Cell Microscopy\",\"Deep Learning ; Segmentation, Grouping and Shape\",Poster,1.1,214,O,,,\n1195,SIXray: A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images,\"Caijing Miao (University of Chinese Academy of Sciences)*; Lingxi Xie (Johns Hopkins University); Fang Wan (University of Chinese Academy of Sciences); Chi Su (Kingsoft Cloud); Hongye Liu (Kingsoft Cloud); Jianbin Jiao (University of Chinese Academy of Sciences); Qixiang Ye (University of Chinese Academy of Sciences, China)\",Vision Applications and Systems,\"Big Data, Large Scale Methods ; Datasets and Evaluation; Deep Learning\",Poster,1.1,215,O,,,\n5541,Noise2Void - Learning Denoising from Single Noisy Images,Alexander Krull (CSBD/MPI-CBG)*; Tim-Oliver Buchholz (CSBD/MPI-CBG); Florian Jug (CSBD/MPI-CBG),\"Medical, Biological and Cell Microscopy\",Deep Learning ; Low-level Vision; Statistical Learning,Poster,1.1,216,O,,,\n513,Gotta Adapt ’Em All: Joint Pixel and Feature-Level Domain 
Adaptation for Recognition in the Wild,Luan Tran (Michigan State University)*; Kihyuk Sohn (NEC Laboratories America); Xiang Yu (NEC Labs); Xiaoming Liu (Michigan State University); Manmohan Chandraker (NEC Labs America),Deep Learning,Representation Learning; Vision Applications and Systems,Poster,1.2,1,A,,,\n1438,Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift,Xiang Li (Nanjing University of Science and Technology)*; Shuo Chen (Nanjing University of Science and Technology); Xiaolin Hu (Tsinghua University); Jian Yang (Nanjing University of Science and Technology),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,2,A,,,\n1489,Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation,\"Chunlei Liu (Beihang University); Wenrui Ding (Beihang University); Xin Xia (Beihang University); Baochang Zhang (Beihang University); Jiaxin Gu (Beihang University); Jianzhuang Liu (Noah's Ark Lab, Huawei Technologies Company, Ltd., China); Rongrong Ji (Xiamen University, China)*; David Doermann (University at Buffalo)\",Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Poster,1.2,3,A,,,\n1536,DeFusionNET: Defocus Blur Detection via Recurrently Fusing and Refining Multi-scale Deep Features,Chang Tang (China University of Geosciences)*; Lizhe Wang (China University of Geosciences); Albert Zomaya (The University of Sydney); Xinwang Liu (National University of Defense Technology); Xinzhong Zhu (Zhejiang Normal University),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding\",Poster,1.2,4,A,,,\n1574,Virtual Networks for Memory Efficient Inference of Multiple Tasks,\"Eunwoo Kim (University of Oxford); Chanho Ahn (Department of ECE and ASRI, Seoul National University); Philip Torr (University of Oxford); Songhwai Oh (Seoul National University)*\",Deep Learning,Representation 
Learning,Poster,1.2,5,A,,,\n1628,Universal Domain Adaptation,\"Kaichao You (Tsinghua Univ); Zhangjie Cao (Tsinghua University); Mingsheng Long (Tsinghua University)*; Jianmin Wang (\"\"Tsinghua University, China\"\"); Michael Jordan (UC Berkeley)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,6,A,,,\n1673,Improving Transferability of Adversarial Examples with Input Diversity,\"Cihang Xie ( Johns Hopkins University ); Yuyin Zhou (Johns Hopkins University)*; Song Bai (University of Oxford); Zhishuai Zhang (Johns Hopkins University); Jianyu Wang (Baidu Research USA); Zhou Ren (\"\"Snap Inc.\"\"); Alan Yuille (Johns Hopkins University)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Statistical Learning\",Poster,1.2,7,A,,,\n1786,Sequence-to-Sequence Domain Adaptation Network for Robust Text Image Recognition,\"Yaping Zhang (National Laboratory of Pattern Recognition, Institute of Automation, University of Chinese Academy of Sciences, Beijing, China); Shuai Nie (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences); Wenju Liu (National Laboratory of Pattern Recognition, Institute of Automation, University of Chinese Academy of Sciences, Beijing, China)*; Xing Xu (University of Electronic Science and Technology of China); Dongxiang Zhang (University of Electronic Science and Technology of China); Heng Tao Shen (University of Electronic Science and Technology of China (UESTC))\",Deep Learning,\"Document Analysis ; Recognition: Detection, Categorization, Retrieval\",Poster,1.2,8,A,,,\n1848,Hybrid-Attention based Decoupled Embedding Learning for Zero-Shot Image Retrieval,Binghui Chen (BUPT)*; Weihong Deng (Beijing University of Posts and Telecommunications),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,1.2,9,A,,,\n1910,Learning to sample,Oren Dovrat (Tel Aviv University)*; Itai Lang (Tel Aviv University); Shai Avidan (Tel 
Aviv University),Deep Learning,,Poster,1.2,10,A,,,\n1937,Few-Shot Learning via Saliency-guided Hallucination of Samples,\"Hongguang Zhang (Australian National University)*; Jing Zhang (Australian National University); Piotr Koniusz (Data61/CSIRO, ANU)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning; Segmentation, Grouping a\",Poster,1.2,11,A,,,\n1961,Variational Convolutional Neural Network Pruning,Chenglong Zhao (Shanghai Jiao Tong University); Bingbing Ni (Shanghai Jiao Tong University)*; Jian Zhang (Shanghai Jiaotong University); Qiwei Zhao (Shanghai Jiao Tong University); Wenjun Zhang (Shanghai Jiao Tong University); Qi Tian (Huawei Noah’s Ark Lab),Deep Learning,Computer Vision Theory,Poster,1.2,12,A,,,\n1983,Towards Optimal Structured CNN Pruning via Generative Adversarial Learning,\"Shaohui Lin (Xiamen University); Rongrong Ji (Xiamen University, China)*; Chenqian Yan (Xiamen University); Baochang Zhang (Beihang University); Liujuan Cao (Xiamen University); Qixiang Ye (University of Chinese Academy of Sciences, China); Feiyue Huang (Tencent); David Doermann (University at Buffalo)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,13,A,,,\n1989,Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression,\"Yuchao Li (Xiamen University); Shaohui Lin (Xiamen University); Baochang Zhang (Beihang University); Jianzhuang Liu (Noah's Ark Lab, Huawei Technologies Company, Ltd., China); David Doermann (University at Buffalo); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd); Feiyue Huang (Tencent); Rongrong Ji (Xiamen University, China)*\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,14,A,,,\n2110,Fully Quantized Network for Object Detection,Rundong Li (ShanghaiTech University)*; Feng Liang (Tsinghua University; SenseTime); Hongwei Qin (Sensetime); Yan Wang (SenseTime); Rui Fan (ShanghaiTech University); Junjie Yan (Sensetime Group Limited),Deep 
Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,1.2,15,A,,,\n2124,MnasNet: Platform-Aware Neural Architecture Search for Mobile,Mingxing Tan (Google Brain)*; Bo Chen (Google); Ruoming Pang (Google Brain); Vijay Vasudevan (Google Brain); Mark Sandler (Google); Andrew Howard (Google); Quoc Le (Google Brain),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,16,A,,,\n2205,\"Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More\",Jingwen Ye (Zhejiang University)*; Yixin Ji (Zhejiang University); Xinchao Wang (Stevens Institute of Technology); Kairi Ou (Alibaba); Dapeng Tao (Yunnan University); Mingli Song (Zhejiang University),Deep Learning,Vision Applications and Systems,Poster,1.2,17,A,,,\n16,Joint Discriminative and Generative Learning for Person Re-identification,\"Zhedong Zheng (University of Technology Sydney)*; Xiaodong Yang (NVIDIA Research); Zhiding Yu (NVIDIA); Liang Zheng (Australian National University); Yi Yang (University of Technology, Sydney); Jan Kautz (NVIDIA)\",\"Recognition: Detection, Categorization, Retrieval\",\"Face, Gesture, and Body Pose ; Representation Learning\",Oral,1.2,18,B,,,\n522,Unsupervised Person Re-identification by Soft Multilabel Learning,\"Hong-Xing Yu (Sun Yat-Sen University); WEI-SHI ZHENG (Sun Yat-sen University, China)*; Ancong Wu (Sun Yat-sen University); Xiaowei Guo (Tencent Youtu Lab); Shaogang Gong (Queen Mary University of London); Jian-Huang Lai (Sun Yat-sen University)\",Biometrics,\"Recognition: Detection, Categorization, Retrieval; Video Analytics\",Oral,1.2,19,B,,,\n2262,Learning Context Graph for Person Search,Yichao Yan (Shanghai Jiao Tong University)*; Qiang Zhang (Shanghai Jiao Tong University); Bingbing Ni (Shanghai Jiao Tong University); Wendong Zhang (Shanghai Jiao Tong University); Minghao Xu (Shanghai Jiaotong University); Xiaokang Yang (Shanghai Jiao Tong University of 
China),\"Recognition: Detection, Categorization, Retrieval\",Visual Reasoning,Oral,1.2,20,B,,,\n220,Gradient Matching Generative Networks for Zero-Shot Learning,Mert Bulent Sariyildiz (Bilkent University)*; Ramazan Gokberk Cinbis (METU),\"Recognition: Detection, Categorization, Retrieval\",Vision + Language,Oral,1.2,21,B,,,\n4499,Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval,\"Sounak Dey (Computer Vision Center)*; Pau Riba (Computer Vision Center); Anjan Dutta (Computer Vision Center); Josep Llados (\"\"Computer Vision Center, Barcelona\"\"); Yi-Zhe Song (Queen Mary University of London)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Vision + Graphics ; Vision Applications and Systems,Oral,1.2,22,B,,,\n5230,Zero-Shot Task Transfer,\"Arghya Pal ( Indian Institute of Technology Hyderabad)*; Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad)\",Representation Learning,Computer Vision Theory; Deep Learning ; Optimization Methods; Vision + Graphics ; Vision Application,Oral,1.2,23,B,,,\n906,C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection,\"Fang Wan (University of Chinese Academy of Sciences)*; Chang Liu (University of Chinese Academy of Sciences); Wei Ke (University of Chinese Academy of Sciences); Xiangyang Ji (Tsinghua University); Jianbin Jiao (University of Chinese Academy of Sciences); Qixiang Ye (University of Chinese Academy of Sciences, China)\",\"Recognition: Detection, Categorization, Retrieval\",Optimization Methods; Others; Statistical Learning; Visual Reasoning,Oral,1.2,24,B,,,\n2973,Learning Inter-pixel Relations for Weakly Supervised Instance Segmentation,Jiwoon Ahn (DGIST); Sunghyun Cho (DGIST); Suha Kwak (POSTECH)*,\"Recognition: Detection, Categorization, Retrieval\",\"Segmentation, Grouping and Shape\",Oral,1.2,25,B,,,\n3916,Attention-based Dropout Layer for Weakly Supervised Object Localization,Junsuk Choe (Yonsei University); Hyunjung Shim (Yonsei 
University)*,\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Scene Analysis and Understanding,Oral,1.2,26,B,,,\n1019,Domain Generalization by Solving Jigsaw Puzzles,Fabio M. Carlucci (Huawei); Antonio D'Innocente (Sapienza Università di Roma); Silvia Bucci (Italian Institute of Technology); Barbara Caputo (IIT); Tatiana Tommasi (Italian Institute of Technology)*,\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,1.2,27,B,,,\n3628,Transferrable Prototypical Networks for Unsupervised Domain Adaptation,Yingwei Pan (JD AI Research)*; Ting Yao (JD AI Research); Yehao Li (Sun Yat-Sen University); Yu Wang (JD AI Research); Chong-Wah Ngo (City University of Hong Kong); Tao Mei (AI Research of JD.com),\"Recognition: Detection, Categorization, Retrieval\",,Oral,1.2,28,B,,,\n1182,Adversarial Meta-Adaptation Network for Blending-target Domain Adaptation,Ziliang Chen (Sun Yat-sen University)*; Jingyu Zhuang (Sun Yat-sen University); Xiaodan Liang (Sun Yat-sen University); Liang Lin (Sun Yat-sen University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,1.2,29,B,,,\n1113,ELASTIC: Improving CNNs by Instance Specific Scaling Policies,\"Huiyu Wang (Johns Hopkins University)*; Aniruddha Kembhavi (Allen Institute for Artificial Intelligence); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); Alan Yuille (Johns Hopkins University); Mohammad Rastegari (Allen Institute for Artificial Intelligence)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,1.2,30,B,,,\n1782,ScratchDet: Training Single-Shot Object Detectors from Scratch,\"Rui Zhu (JD AI Research)*; Shifeng Zhang (CBSR, NLPR, CASIA); Xiaobo Wang (JD AI Research); Longyin Wen (JD Digits); Hailin Shi (JD AI Research); Liefeng Bo (JD Finance); Tao Mei (AI Research of JD.com)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,1.2,31,B,,,\n3294,SFNet: Learning Object-aware Semantic 
Correspondence,Junghyup Lee (Yonsei University); DOHYUNG KIM (YONSEI UNIVERSITY); Jean Ponce (Inria); Bumsub Ham (Yonsei University)*,\"Recognition: Detection, Categorization, Retrieval\",Motion and Tracking; Scene Analysis and Understanding,Oral,1.2,32,B,,,\n1294,Deep Metric Learning Beyond Binary Supervision,Sungyoun Kim (POSTECH); Minkyo Seo (POSTECH); Ivan Laptev (INRIA Paris); Minsu Cho (POSTECH); Suha Kwak (POSTECH)*,\"Recognition: Detection, Categorization, Retrieval\",,Oral,1.2,33,B,,,\n1510,Learning to Cluster Faces on an Affinity Graph,Lei Yang (The Chinese University of Hong Kong)*; Xiaohang Zhan (The Chinese University of Hong Kong); Dapeng Chen (Sensetime Group Limited); Junjie Yan (Sensetime Group Limited); Chen Change Loy (Nanyang Technological University); Dahua Lin (The Chinese University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",\"Deep Learning ; Face, Gesture, and Body Pose ; Segmentation, Grouping and Shape\",Oral,1.2,34,B,,,\n1610,C2AE: Class Conditioned Auto-Encoder for Open-set Recognition,Poojan B Oza (Johns Hopkins University)*; Vishal Patel (Johns Hopkins University),\"Recognition: Detection, Categorization, Retrieval\",Vision Applications and Systems,Oral,1.2,35,B,,,\n602,K-Nearest Neighbors Hashing,\"Xiangyu He (Institute of Automation, Chinese Academy of Sciences)*; Peisong Wang (Institute of Automation, Chinese Academy of Sciences); Jian Cheng (\"\"Chinese Academy of Sciences, China\"\")\",\"Recognition: Detection, Categorization, Retrieval\",,Poster,1.2,36,B,,,\n1393,Learning RoI Transformer for Oriented Object Detection in Aerial Images,Jian Ding (Wuhan University); Nan Xue (Wuhan University); Yang Long (Wuhan University ); Gui-Song Xia (Wuhan University)*; Qikai Lu (Wuhan University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Vision Applications and Systems,Poster,1.2,37,B,,,\n1429,Snapshot Distillation: Teacher-Student Optimization in One Generation,Chenglin Yang (Johns Hopkins 
University); Lingxi Xie (Johns Hopkins University)*; Chi Su (Kingsoft Cloud); Alan Yuille (Johns Hopkins University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.2,38,B,,,\n1440,Geometry-Aware Distillation for Indoor Semantic Segmentation,Jianbo Jiao (University of Oxford)*; Yunchao Wei (UIUC); Zequn Jie (Tencent AI Lab); Honghui Shi (IBM | UIUC); Rynson W.H. Lau (City University of Hong Kong); Thomas Huang (UIUC),\"Recognition: Detection, Categorization, Retrieval\",\"Deep Learning ; Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,1.2,39,B,,,\n1455,LiveSketch: Query Perturbations for Guided Sketch-based Visual Search,John Collomosse (University of Surrey)*; Tu Bui (University of Surrey); Hailin Jin (Adobe Research),\"Recognition: Detection, Categorization, Retrieval\",Document Analysis ; Vision Applications and Systems,Poster,1.2,40,B,,,\n1525,Bounding Box Regression with Uncertainty for Accurate Object Detection,Yihui He (Carnegie Mellon University)*; Chenchen Zhu (Carnegie Mellon University); Jianren Wang (Carnegie Mellon University); Marios Savvides (Carnegie Mellon University); Xiangyu Zhang (Megvii Inc),\"Recognition: Detection, Categorization, Retrieval\",,Poster,1.2,41,B,,,\n1528,OCGAN: One-class Novelty Detection Using GANs with Constrained Latent Representations,Pramuditha Perera (Johns Hopkins University)*; Ramesh Nallapati (Amazon); Bing Xiang (Amazon),\"Recognition: Detection, Categorization, Retrieval\",Representation Learning; Vision Applications and Systems,Poster,1.2,42,B,,,\n1651,Learning Metrics from Teachers: Compact Networks for Image Embedding,Lu Yu (CVC)*; Vacit Oguz Yazici (CVC/Wide-Eyes Technologies); Xialei Liu (Computer Vision Center Barcelona); Joost van de Weijer (Computer Vision Center); Yongmei Cheng (NWPU); Arnau Ramisa (Wide-Eyes Technologies),\"Recognition: Detection, Categorization, Retrieval\",Representation Learning,Poster,1.2,43,B,,,\n1681,Activity Driven Weakly 
Supervised Object Detection,Zhenheng Yang (Facebook Research)*; Vignesh Ramanathan (Facebook); Deepti Ghadiyaram (Facebook); Ram Nevatia (U of Southern California); Dhruv Mahajan (Facebook),\"Recognition: Detection, Categorization, Retrieval\",Action Recognition ; Video Analytics,Poster,1.2,44,B,,,\n1738,Separate to Adapt: Open Set Domain Adaptation via Progressive Separation,\"Hong Liu (Tsinghua University); Zhangjie Cao (Tsinghua University); Mingsheng Long (Tsinghua University)*; Jianmin Wang (\"\"Tsinghua University, China\"\"); Qiang Yang (Hong Kong UST)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.2,45,B,,,\n1810,Layout-Graph Reasoning for Fashion Landmark Detection,Weijiang Yu (SUN YAT-SEN UNIVERSITY); Xiaodan Liang (Sun Yat-sen University)*; Ke Gong (Sun Yat-sen University ); ChenHan Jiang (Sun Yat-sen University); Nong Xiao (Sun Yat-sen University); Liang Lin (Sun Yat-sen University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.2,46,B,,,\n1828,DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs,Yang Erkun (Xidian University); Tongliang Liu (The University of Sydney); Cheng Deng (Xidian University)*; Wei Liu (Tencent); Dacheng Tao (The University of Sydney),\"Recognition: Detection, Categorization, Retrieval\",,Poster,1.2,47,B,,,\n1834,Mind Your Neighbours: Image Annotation with Metadata Neighbourhood Graph Co-Attention Networks,\"Junjie Zhang (University of Technology, Sydney); Qi Wu (University of Adelaide)*; Jian Zhang (UTS); Chunhua Shen (University of Adelaide); Jianfeng Lu (Nanjing University of Science and Technology)\",\"Recognition: Detection, Categorization, Retrieval\",Representation Learning,Poster,1.2,48,B,,,\n1843,Region Proposal by Guided Anchoring,Kai Chen (The Chinese University of Hong Kong)*; Jiaqi Wang (CUHK); Shuo Yang (Amazon); Chen Change Loy (Nanyang Technological University); Dahua Lin (The Chinese University of Hong Kong),\"Recognition: Detection, 
Categorization, Retrieval\",,Poster,1.2,49,B,,,\n1847,Distant Supervised Centroid Shift: A Simple and Efficient Approach to Visual Domain Adaptation,\"Jian Liang (NLPR)*; Ran He (Institute of Automation, Chinese Academy of Sciences); Zhenan Sun (Chinese Academy of Sciences); Tieniu Tan (NLPR, China)\",\"Recognition: Detection, Categorization, Retrieval\",Statistical Learning,Poster,1.2,50,B,,,\n1855,Learning to Transfer Examples for Partial Domain Adaptation,\"Zhangjie Cao (Tsinghua University); Kaichao You (Tsinghua Univ); Mingsheng Long (Tsinghua University)*; Jianmin Wang (\"\"Tsinghua University, China\"\"); Qiang Yang (Hong Kong UST)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,1.2,51,B,,,\n1915,Generalized Zero-Shot Recognition based on Visually Semantic Embedding,Pengkai Zhu (Boston University)*; Hanxiao Wang (Boston University); Venkatesh Saligrama (Boston University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Statistical Learning,Poster,1.2,52,B,,,\n1962,Towards Visual Feature Translation,\"Jie Hu (Xiamen University); Rongrong Ji (Xiamen University, China)*; Hong Liu (Xiamen University); ShengChuan Zhang (Xiamen University); Cheng Deng (Xidian University); Qi Tian (Huawei Noah’s Ark Lab)\",\"Recognition: Detection, Categorization, Retrieval\",Representation Learning,Poster,1.2,53,B,,,\n1997,Amodal Instance Segmentation through KINS Dataset,Lu Qi (The Chinese University of Hong Kong)*; Li Jiang (The Chinese University of Hong Kong); Shu Liu (Tencent); Xiaoyong Shen (Tencent); Jiaya Jia (Chinese University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",Datasets and Evaluation; Scene Analysis and Understanding,Poster,1.2,54,B,,,\n2016,Global Second-order Pooling Convolutional Networks,Zilin Gao (Dalian University of Technology); Jiangtao Xie (Dalian University of Technology); Qilong Wang (Tianjin University); Peihua Li (Dalian University of Technology)*,\"Recognition: 
Detection, Categorization, Retrieval\",Deep Learning,Poster,1.2,55,B,,,\n2125,Weakly Supervised Complementary Parts Models for Image Classification from the Bottom Up,Weifeng Ge (The University of Hong Kong); Xiangru Lin (The University of Hong Kong); Yizhou Yu (Deepwise AI Lab)*,\"Recognition: Detection, Categorization, Retrieval\",\"Segmentation, Grouping and Shape\",Poster,1.2,56,B,,,\n2139,\"NetTailor: Tuning the architecture, not just the weights\",\"Pedro Morgado (University of California, San Diego)*; Nuno Vasconcelos (UCSD, USA)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,1.2,57,B,,,\n1358,Learning-based Sampling for Natural Image Matting,Jingwei Tang (ETHZ & Disney Research)*; Yagiz Aksoy (ETHZ); Cengiz Oztireli (Disney Research); Markus Gross (ETH Zurich); Tunç Aydin (Disney Research),\"Segmentation, Grouping and Shape\",Low-level Vision,Poster,1.2,58,C,,,\n1437,Learning Unsupervised Video Object Segmentation through Visual Attention,Wenguan Wang (Inception Institute of Artificial Intelligence); Hongmei Song (Beijing Institute of Technology); Shuyang Zhao (Beijing Institute of Technology ); Jianbing Shen (Beijing Institute of Technology)*; Sanyuan Zhao (Beijing Institute of Technology ); Steven Hoi (SMU); Haibin Ling (Temple University),\"Segmentation, Grouping and Shape\",Image and Video Synthesis,Poster,1.2,59,C,,,\n1523,4D Spatio-Temporal ConvNet: Minkowski Convolutional Neural Network,Christopher Choy (Stanford University)*; JunYoung Gwak (Stanford University); Silvio Savarese (Stanford University),\"Segmentation, Grouping and Shape\",3D from Multiview and Sensors; 3D from Single Image; Scene Analysis and Understanding,Poster,1.2,60,C,,,\n1642,Pyramid Feature Selective Network for Saliency detection,\"Ting Zhao (Harbin Institute of Technology, China)*; XIANGQIAN WU (Harbin Institute of Technology, China)\",\"Segmentation, Grouping and Shape\",\"Deep Learning ; Recognition: Detection, 
Categorization, Retrieval\",Poster,1.2,61,C,,,\n1682,Co-saliency Detection via Mask-guided Fully Convolutional Networks with Multi-scale Label Smoothing,Kaihua Zhang (NUIST)*; Tengpeng Li (NUIST); Bo Liu (Rutgers University); Qingshan Liu (Nanjing University of Information Science & Technology),\"Segmentation, Grouping and Shape\",Deep Learning,Poster,1.2,62,C,,,\n1750,SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation - A Synthetic Dataset and Baselines,Yuan-Ting Hu (University of Illinois at Urbana-Champaign)*; Hong-Shuo Chen (UIUC); Kexin Hui (UIUC); Jia-Bin Huang (Virginia Tech); Alexander Schwing (UIUC),\"Segmentation, Grouping and Shape\",,Poster,1.2,63,C,,,\n1798,Learning Instance Activation Maps for Weakly Supervised Instance Segmentation,\"Yi Zhu (University of Chinese Academy of Sciences)*; Yanzhao Zhou (University of Chinese Academy of Sciences); Huijuan Xu (University of California, Berkeley); Qixiang Ye (University of Chinese Academy of Sciences, China); David Doermann (University at Buffalo); Jianbin Jiao (University of Chinese Academy of Sciences)\",\"Segmentation, Grouping and Shape\",\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,64,C,,,\n1970,Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation,Zhi Tian (The University of Adelaide)*; Tong He (The University of Adelaide); Chunhua Shen (University of Adelaide); Youliang Yan (Huawei),\"Segmentation, Grouping and Shape\",Deep Learning,Poster,1.2,65,C,,,\n1998,Box-driven Class-wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation,\"Chunfeng Song (CASIA, UCAS, CRIPAC)*; Yan Huang (Institute of Automation, Chinese Academy of Sciences); Wanli Ouyang (The University of Sydney); Liang Wang (NLPR, China)\",\"Segmentation, Grouping and Shape\",Deep Learning ; Low-level Vision; Scene Analysis and Understanding; Vision + Graphics,Poster,1.2,66,C,,,\n2137,Dual Attention Network for Scene 
Segmentation,\"Jun Fu (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences and University of Chinese Academy of Sciences)*; Jing Liu (National Lab of Pattern Recognition, Institute of Automation,Chinese Academy of Sciences); Haijie Tian (BIT); Yong Li (Business Growth BU, JD.com); Yongjun Bao (JD.com); Zhiwei Fang (National Lab of Pattern Recognition, Institute of Automation,Chinese Academy of Sciences;University of Chinese Academy of Sciences ); Hanqing Lu (NLPR, Institute of Automation, CAS)\",\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Poster,1.2,67,C,,,\n1444,InverseRenderNet: Learning single image inverse rendering,Ye Yu (University of York)*; William Smith (University of York),Physics-based Vision and Shape-from-X,3D from Single Image; Deep Learning,Poster,1.2,68,D,,,\n1470,A Variational Auto-Encoder Model for Stochastic Point Processes,Nazanin Mehrasa (Simon Fraser University)*; Akash Abdu Jyothi (Simon Fraser University); Thibaut Durand (Simon Fraser University); Jiawei He (Simon Fraser University); Leonid Sigal (University of British Columbia); Greg Mori (Simon Fraser University),Statistical Learning,,Poster,1.2,69,D,,,\n1558,Unifying Heterogeneous Classifiers with Distillation,Jayakorn Vongkulbhisal (IBM Research)*; Phongtharin Vinayavekhin (IBM Research); Marco Visentini-Scarzanella (IBM Research),Statistical Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,70,D,,,\n1606,Assessment of Faster-RCNN in Man-Machine collaborative search,Arturo Deza (Harvard University)*; Amit Surana (United Technologies Research Center); Miguel Eckstein (UCSB),Datasets and Evaluation,,Poster,1.2,71,D,,,\n1758,OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge,\"Kenneth Marino (Carnegie Mellon University)*; Mohammad Rastegari (Allen Institute for Artificial Intelligence); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); 
Roozbeh Mottaghi (Allen Institute for AI)\",Datasets and Evaluation,Vision + Language,Poster,1.2,72,D,,,\n1835,NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction,Yuan Gao (Tencent AI Lab)*; Jiayi Ma (Wuhan University); Mingbo Zhao (Donghua University); Alan Yuille (Johns Hopkins University); Wei Liu (Tencent),Statistical Learning,Representation Learning,Poster,1.2,73,D,,,\n1932,Spectral Metric for Dataset Complexity Assessment,\"Frederic Branchaud-Charron (Universite de Sherbrooke)*; Pierre-Marc Jodoin (Universite de Sherbrooke); Andrew Achkar (Miovision Technologies Inc., Canada)\",Datasets and Evaluation,\"Computer Vision Theory; Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,1.2,74,D,,,\n1994,ADCrowdNet: An Attention-injective Deformable Convolutional Network for Crowd Understanding,\"Ning Liu (Sun Yat-sen University )*; Yongchao Long (Sun Yat-sen University ); Changqing Zou (University of Maryland (UMD)); Qun Niu (Sun Yat-sen University); Li Pan (Shanghai Jiaotong University, China); Hefeng Wu (Guangdong University of Foreign Studies)\",Statistical Learning,Deep Learning,Poster,1.2,75,D,,,\n2161,Feature Distance Adversarial Network for Vehicle Re-Identification,Yihang Lou (Peking University)*; YAN BAI (Peking University); Jun Liu (Nanyang Technological University); Shiqi Wang (CityU); Lingyu Duan (Peking University),Datasets and Evaluation,Representation Learning; Vision Applications and Systems,Poster,1.2,76,D,,,\n109,3D Local Features for Direct Pairwise Registration,Tolga Birdal (TU Munich)*; Haowen Deng (Technical University of Munich); Slobodan Ilic (TUM),3D from Multiview and Sensors,\"Recognition: Detection, Categorization, Retrieval; Robotics + Driving\",Poster,1.2,77,E,,,\n1396,SPLFlowNet: Sparse Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point Clouds,\"Xiuye Gu (Stanford University)*; Chongruo Wu (UC Davis); Yijie Wang (TuSimple); Panqu Wang 
(TuSimple); Yong Jae Lee (University of California, Davis)\",3D from Multiview and Sensors,Deep Learning ; Low-level Vision; Motion and Tracking; Robotics + Driving; Scene Analysis and Unders,Poster,1.2,78,E,,,\n1409,GPSfM: Global Projective SFM Using Algebraic Constraints on Multi-View Fundamental Matrices,Yoni Kasten (Weizmann Institute )*; Amnon Geifman (Weizmann Institute); Meirav Galun (Weizmann Institute of Science); Ronen Basri (Weizmann Institute of Science),3D from Multiview and Sensors,,Poster,1.2,79,E,,,\n1410,Group-wise Correlation Stereo Network,\"Xiaoyang Guo (The Chinese University of Hong Kong)*; Kai Yang (SenseTime Research); Wukui Yang (SenseTime Research); Hongsheng Li (Chinese University of Hong Kong); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",3D from Multiview and Sensors,,Poster,1.2,80,E,,,\n1562,Multi-Level Context Ultra-Aggregation for Stereo Matching,Guang-Yu Nie (Beijing Institute of Technology); Ming-Ming Cheng (Nankai University); Yun Liu (Nankai University); Zhengfa Liang (Southwest Electronics and Telecommunication Technology Research Institute); Deng-Ping Fan (Nankai University); Yue Liu (Beijing Institute of Technology)*; Yongtian Wang (Beijing Institute of Technology),3D from Multiview and Sensors,Deep Learning ; RGBD sensors and analytics; Robotics + Driving; Scene Analysis and Understanding,Poster,1.2,81,E,,,\n1781,\"Large-Scale, Metric Structure from Motion for Unordered Light Fields\",Sotiris Nousias (University College London)*; Manolis Lourakis (FORTH - Hellas); Christos Bergeles (Kings College London),3D from Multiview and Sensors,,Poster,1.2,82,E,,,\n1887,Understanding the Limitations of CNN-based Absolute Camera Pose Regression,Torsten Sattler (Chalmers University of Technology)*; Qunjie Zhou (Technical University of Munich); Marc Pollefeys (ETH Zurich / Microsoft); Laura Leal-Taixé (TUM),3D from Multiview and Sensors,\"Computer Vision Theory; Deep Learning ; Recognition: Detection, Categorization, 
Retrieval\",Poster,1.2,83,E,,,\n1899,DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene from Sparse LiDAR Data and Single Color Image,Jiaxiong Qiu (UESTC); Zhaopeng Cui (ETH Zurich)*; Yinda Zhang (Princeton University); xingdi zhang (UESTC); Shuaicheng Liu (UESTC; Megvii); Bing Zeng (University of Electronic Science and Technology of China); Marc Pollefeys (ETH Zurich / Microsoft),3D from Multiview and Sensors,Deep Learning ; Robotics + Driving,Poster,1.2,84,E,,,\n2003,Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling,Jiancheng YANG (Shanghai Jiao Tong University); Qiang Zhang (Shanghai Jiao Tong University); Bingbing Ni (Shanghai Jiao Tong University)*; Linguo Li (Shanghai Jiao Tong University); Jinxian Liu (Shanghai Jiao Tong University); Mengdie Zhou (Shanghai Jiao Tong University); Qi Tian (Huawei Noah’s Ark Lab),3D from Multiview and Sensors,Deep Learning,Poster,1.2,85,E,,,\n2043,Learning with Batch-wise Optimal Transport Loss for 3D Shape Recognition,Lin Xu (Institute of Advanced Artificial Intelligence (IAAI))*; Han Sun (Institute of Advanced Artificial Intelligence (IAAI)); Yuai Liu (Institute of Advanced Artificial Intelligence (IAAI)),3D from Multiview and Sensors,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,86,E,,,\n2066,DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion,Chen Wang (Shanghai Jiao Tong University)*; Danfei Xu (Stanford University); Yuke Zhu (Stanford University); Roberto Martín-Martín (Stanford University); Cewu Lu (Shanghai Jiao Tong University); Li Fei-Fei (Stanford University); Silvio Savarese (Stanford University),3D from Multiview and Sensors,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; RGBD sensors and analytics; Robot\",Poster,1.2,87,E,,,\n1365,Dense Depth Posterior (DDP) from Single Image and Sparse Range,\"Yanchao Yang (UCLA)*; Alex Wong (University of California, Los Angeles); Stefano Soatto (UCLA)\",3D from Single Image,3D from 
Multiview and Sensors; Robotics + Driving; Scene Analysis and Understanding,Poster,1.2,88,F,,,\n1441,DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama,Shang-Ta Yang (National Tsing Hua University)*; Fu-En Wang (National Tsing Hua University); Chi-Han Peng (KAUST); Peter Wonka (KAUST); Hung-Kuo Chu (National Tsing Hua University); Min Sun (NTHU),3D from Single Image,Deep Learning,Poster,1.2,89,F,,,\n1638,Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach,Amir Atapour-Abarghouei (Durham University)*; Toby Breckon (Durham University),3D from Single Image,\"Deep Learning ; RGBD sensors and analytics; Segmentation, Grouping and Shape\",Poster,1.2,90,F,,,\n1717,Segmentation-driven 6D Object Pose Estimation,\"Yinlin Hu (EPFL)*; Joachim Hugonot (EPFL); Pascal Fua (EPFL, Switzerland); Mathieu Salzmann (EPFL)\",3D from Single Image,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,1.2,91,F,,,\n1884,Exploiting temporal context for 3D human pose estimation in the wild,Anurag Arnab (University of Oxford)*; Carl Doersch (DeepMind); Andrew Zisserman (University of Oxford),3D from Single Image,\"3D from Multiview and Sensors; Big Data, Large Scale Methods ; Face, Gesture, and Body Pose\",Poster,1.2,92,F,,,\n2029,What Do Single-view 3D Reconstruction Networks Learn?,Maxim Tatarchenko (Freiburg)*; Stephan R Richter (Intel Labs); Rene Ranftl (Intel Labs); Zhuwen Li (Pony AI); Vladlen Koltun (Intel Labs); Thomas Brox (University of Freiburg),3D from Single Image,Deep Learning,Poster,1.2,93,F,,,\n391,UniformFace: Learning Deep Equidistributed Representation for Face Recognition,Yueqi Duan (Tsinghua University); Jiwen Lu (Tsinghua University)*; Jie Zhou (Tsinghua University),\"Face, Gesture, and Body Pose\",,Poster,1.2,94,G,,,\n1418,Semantic Graph Convolutional Networks for 3D Human Pose Regression,Long Zhao (Rutgers University)*; Xi 
Peng (Binghamton University); Yu Tian (Rutgers); Mubbasir Kapadia (Rutgers University); Dimitris Metaxas (Rutgers),\"Face, Gesture, and Body Pose\",3D from Single Image; Vision Applications and Systems,Poster,1.2,95,G,,,\n1427,Mask-Guided Portrait Editing with Conditional GANs,shuyang V Gu (University of Science and Technology of China)*; Jianmin Bao (University of Science and Technology of China); Hao Yang (Microsoft Research Asia); Dong Chen (Microsoft Research Asia); Fang Wen (Microsoft Research Asia ); Lu Yuan (Microsoft),\"Face, Gesture, and Body Pose\",Image and Video Synthesis,Poster,1.2,96,G,,,\n1477,Group Sampling Networks for Scale Invariant Face Detection,Xiang Ming (Xi'an Jiaotong University)*; Fangyun Wei (Microsoft Research Asia); Ting Zhang (MSRA); Dong Chen (Microsoft Research Asia); Fang Wen (Microsoft Research Asia ),\"Face, Gesture, and Body Pose\",\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,1.2,97,G,,,\n1632,Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation,\"Yong Zhang (Tencent AI Lab); Baoyuan Wu (Tencent AI Lab)*; Weiming Dong (NLPR, Institute of Automation, Chinese Academy of Sciences); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent); Bao-Gang Hu (Institute of Automation, Chinese Academy of Sciences); Qiang Ji (Rensselaer Polytechnic Institute)\",\"Face, Gesture, and Body Pose\",Biometrics,Poster,1.2,98,G,,,\n1692,Semantic Alignment: Finding Semantically Consistent Ground-truth for Facial Landmark Detection,\"Zhiwei Liu ( Institute of Automation Chinese Academy of Sciences); Xiangyu Zhu (NLPR)*; Guosheng Hu (AnyVision); Haiyun Guo (CASIA); Ming Tang (Chinese Academy of Sciences, China); Zhen Lei (NLPR, CASIA, China); Neil Robertson (Queen's University Belfast); Jinqiao Wang (Institute of Automation, Chinese Academy of Sciences)\",\"Face, Gesture, and Body Pose\",Biometrics,Poster,1.2,99,G,,,\n1725,LAEO-Net: revisiting people Looking At Each 
Other in videos,Manuel J. Marín-Jiménez (University of Córdoba)*; Vicky Kalogeiton (University of Oxford); Pablo Medina Suárez (University of Cordoba); Andrew Zisserman (University of Oxford),\"Face, Gesture, and Body Pose\",Action Recognition ; Deep Learning ; Video Analytics; Vision Applications and Systems,Poster,1.2,100,G,,,\n1730,Robust Facial Landmark Detection via Occlusion-adaptive Deep Networks,\"Meilu Zhu (College of Computer Science and Software Engineering, Shenzhen University); Mingjie Zheng (College of Computer Science and Software Engineering, Shenzhen University); Daming Shi (College of Computer Science and Software Engineering, Shenzhen University)*; Muhammad Sadiq (College of Computer Science and Software Engineering, Shenzhen University)\",\"Face, Gesture, and Body Pose\",Deep Learning,Poster,1.2,101,G,,,\n1766,Learning Individual Styles of Conversational Gesture,Shiry Ginosar (UC Berkeley)*; Amir Bar (Zebra Medical Vision); Gefen Kohavi (UC Berkeley); Caroline M Chan (MIT); Andrew Owens (UC Berkeley); Jitendra Malik (University of California at Berkeley),\"Face, Gesture, and Body Pose\",Datasets and Evaluation; Deep Learning ; Image and Video Synthesis,Poster,1.2,102,G,,,\n1863,\"Face Anti-Spoofing: Model Matters, So Does Data\",Xiao Yang (Tencent AI Lab; SJTU); Wenhan Luo (Tencent AI Lab); Linchao Bao (Tencent AI Lab); Yuan Gao (Tencent AI Lab); dihong gong (Tencent AI Lab); Shibao Zheng (SJTU); Zhifeng Li (Tencent AI Lab)*; Wei Liu (Tencent),\"Face, Gesture, and Body Pose\",Biometrics,Poster,1.2,103,G,,,\n1870,Fast Human Pose Estimation,Feng Zhang (University of Electronic Science and Technology of China); Xiatian Zhu (Vision Semantics Limited); Mao Ye (University of Electronic Science and Technology of China)*,\"Face, Gesture, and Body Pose\",\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,104,G,,,\n2019,Decorrelated Adversarial Learning for Age-Invariant Face Recognition,Hao Wang (Tencent AI Lab)*; dihong gong (Tencent AI 
Lab); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent),\"Face, Gesture, and Body Pose\",\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,1.2,105,G,,,\n1360,Cross-task weakly supervised learning from instructional videos,Dmitry Zhukov (Inria)*; Jean-Baptiste Alayrac (DeepMind); Ramazan Gokberk Cinbis (METU); David Fouhey (University of Michigan); Ivan Laptev (INRIA Paris); Josef Sivic (INRIA),Action Recognition,,Poster,1.2,106,H,,,\n1466,D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation,Chien-Yi Chang (Stanford University)*; De-An Huang (Stanford University); Yanan Sui (Stanford University); Li Fei-Fei (Stanford University); Juan Carlos Niebles (Stanford University),Video Analytics,,Poster,1.2,107,H,,,\n1584,Progressive Teacher-student Learning for Early Action Prediction,\"Xionghui Wang (Sun Yat-sen University, China); Jian-Fang HU (Sun Yat-sen University)*; WEI-SHI ZHENG (Sun Yat-sen University, China); Jianguo Zhang (University of Dundee); Jian-Huang Lai (Sun Yat-sen University)\",Action Recognition,Deep Learning ; Video Analytics; Vision Applications and Systems,Poster,1.2,108,H,,,\n1711,Social Relation Recognition from Videos via Multi-scale Spatial-Temporal Reasoning,Xinchen Liu (JD.com)*; Wu Liu (AI Research of JD.com); Meng Zhang (Beijing University of Posts and Telecommunications); Jingwen Chen (Hangzhou Dianzi University); Lianli Gao (The University of Electronic Science and Technology of China); Chenggang Yan (Hangzhou Dianzi University); Tao Mei (JD.com),Video Analytics,Visual Reasoning,Poster,1.2,109,H,,,\n1726,MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation,Yazan Abu Farha (University of Bonn)*; Jürgen Gall (University of Bonn),Action Recognition,,Poster,1.2,110,H,,,\n1759,Transferable Interactiveness Prior for Human-Object Interaction Detection,Yong-Lu Li (Shanghai Jiao Tong University); Siyuan Zhou (Shanghai Jiao Tong 
University); Xijie Huang (Shanghai Jiao Tong University); Liang Xu (Shanghai Jiao Tong University); Ze Ma (SJTU); Hao-Shu Fang (SJTU); Yan-Feng Wang (Cooperative medianet innovation center of Shanghai Jiao Tong University); Cewu Lu (Shanghai Jiao Tong University)*,Action Recognition,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,111,H,,,\n1911,Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition,\"Maosen Li (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University)*; Siheng Chen (Carnegie Mellon University); Xu Chen (Cooperative Medianet Innovation Center, Shanghai Jiaotong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Yan-Feng Wang (Cooperative medianet innovation center of Shanghai Jiao Tong University); Qi Tian (Huawei Noah’s Ark Lab)\",Action Recognition,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,112,H,,,\n1995,Multi-granularity Generator for Temporal Action Proposal,yuan liu (Southeast University)*; Lin Ma (Tencent AI Lab); Yifeng Zhang (Southeast University); Wei Liu (Tencent); Shih-Fu Chang (Columbia University),Video Analytics,Action Recognition,Poster,1.2,113,H,,,\n1421,Deep Structured Scene Flow,\"Wei-Chiu Ma (MIT)*; Shenlong Wang (Uber ATG, University of Toronto); Rui Hu (Uber); Yuwen Xiong (Uber ATG; University of Toronto); Raquel Urtasun (Uber ATG)\",Motion and Tracking,Low-level Vision; Robotics + Driving,Poster,1.2,114,I,,,\n1570,\"See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks\",Xiankai Lu (Inception Institute of Artificial Intelligence); Wenguan Wang (Inception Institute of Artificial Intelligence); Chao Ma (Shanghai Jiao Tong University); Jianbing Shen (Beijing Institute of Technology)*; Ling Shao (Inception Institute of Artificial Intelligence); Fatih Porikli (ANU),Motion and Tracking,Video Analytics; Vision Applications and Systems,Poster,1.2,115,I,,,\n1894,Patch 
Based Discriminative Feature Learning for Unsupervised Person Re-identification,\"Qize Yang (Sun Yat-sen University); Hong-Xing Yu (Sun Yat-Sen University); Ancong Wu (Sun Yat-sen University); WEI-SHI ZHENG (Sun Yat-sen University, China)*\",Biometrics,\"Recognition: Detection, Categorization, Retrieval; Video Analytics\",Poster,1.2,116,I,,,\n2104,SPM-Tracker: Series-Parallel Matching for Real-Time Visual Object Tracking,Guangting Wang (University of Science and Technology of China); Chong Luo (Microsoft Research Asia)*; Zhiwei Xiong (University of Science and Technology of China); Wenjun Zeng (Microsoft Research),Motion and Tracking,,Poster,1.2,117,I,,,\n1426,Shapes and Context: In-the-wild Image Synthesis & Manipulation,Aayush Bansal (Carnegie Mellon University)*; Yaser Sheikh (CMU); Deva Ramanan (Carnegie Mellon University),Image and Video Synthesis,\"Big Data, Large Scale Methods ; Segmentation, Grouping and Shape; Vision + Graphics\",Oral,1.2,118,J,,,\n462,Semantics Disentangling for Text-to-Image Generation,\"Guojun Yin (University of Science and Technology of China); Bin Liu (University of Science and Technology of China); Lu Sheng (The Chinese University of Hong Kong)*; Nenghai Yu (University of Science and Technology of China); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Jing Shao (Sensetime)\",Image and Video Synthesis,Vision + Language,Oral,1.2,119,J,,,\n2072,Semantic Image Synthesis with Spatially-Adaptive Normalization,Taesung Park (UC Berkeley)*; Ming-Yu Liu (NVIDIA); Ting-Chun Wang (NVIDIA); Jun-Yan Zhu (MIT),Image and Video Synthesis,Computational Photography; Deep Learning ; Vision + Graphics,Oral,1.2,120,J,,,\n609,Progressive Pose Attention Transfer for Person Image Generation,Zhen Zhu (Huazhong University of Science and Technology)*; Tengteng Huang (Huazhong University of Science and Technology); Baoguang Shi (Microsoft); Miao Yu (Huazhong University of Science and Technology); Bofei Wang (ZTE Corporation); Xiang Bai (Huazhong 
University of Science and Technology),Image and Video Synthesis,,Oral,1.2,121,J,,,\n3269,Unsupervised Person Image Generation with Semantic Parsing Transformation,Sijie Song (Peking University)*; Wei Zhang (JD AI Research); Jiaying Liu (Peking University); Tao Mei (AI Research of JD.com),Image and Video Synthesis,Vision Applications and Systems,Oral,1.2,122,J,,,\n2439,DeepView: View synthesis with learned gradient descent,John P Flynn (Google Inc)*; Michael Broxton (Google); Paul E Debevec (Google VR); Graham Fyffe (Google Inc.); Ryan S. Overbeck (Google Inc.); Noah Snavely (Google); Richard Tucker (Google); Matthew DuVall (Google),Image and Video Synthesis,3D from Multiview and Sensors; Computational Photography; Deep Learning ; Optimization Methods,Oral,1.2,123,J,,,\n4908,Animating Arbitrary Objects via Deep Motion Transfer,Aliaksandr Siarohin (University of Trento)*; Stéphane Lathuiliere (university of Trento); Sergey Tulyakov (Snap Inc); Elisa Ricci (FBK - Technologies of Vision); Nicu Sebe (University of Trento),Image and Video Synthesis,Deep Learning,Oral,1.2,124,J,,,\n5428,Textured Neural Avatars,Aliaksandra Shysheya (Samsung); Egor Zakharov (Skoltech); Renat Bashirov (Samsung); Igor Pasechnik (Samsung); Egor Burkov (Skoltech); Dmitry Ulyanov (Skoltech); Yury Malkov (Samsung); Karim Iskakov (Samsung); Kara-Ali Aliev (Samsung); Alexey Ivakhnenko (Samsung); Alexander Vakhitov (Samsung AI Research Center); Victor Lempitsky (Samsung)*,Image and Video Synthesis,Deep Learning ; Vision + Graphics ; Vision Applications and Systems,Oral,1.2,125,J,,,\n3190,IM-Net for High Resolution Video Frame Interpolation,Tomer Peleg (Samsung Israel R&D Center)*; Pablo Szekely (Samsung Israel R&D Center); Doron Sabo (Samsung Israel R&D Center); Omry Sendik (Samsung Israel R&D Center),Image and Video Synthesis,Datasets and Evaluation; Deep Learning ; Low-level Vision; Motion and Tracking; Vision Applications,Oral,1.2,126,J,,,\n1240,Homomorphic Latent Space Interpolation for Unpaired 
Image-to-image Translation,Yingcong Chen (Chinese University of Hong Kong)*; Xiaogang XU (The Chinese University of Hong Kong); Zhuotao Tian (Chinese University of Hong Kong); Jiaya Jia (Chinese University of Hong Kong),Image and Video Synthesis,\"Face, Gesture, and Body Pose ; Vision Applications and Systems\",Oral,1.2,127,J,,,\n3069,Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation,Hao Tang (University of Trento)*; Dan Xu (University of Oxford); Yan Yan (Texas State University); Yanzhi Wang (Northeastern University); Jason J Corso (University of Michigan); Nicu Sebe (University of Trento),Image and Video Synthesis,,Oral,1.2,128,J,,,\n4341,Geometry-Consistent Generative Adversarial Networks for One-Sided Unsupervised Domain Mapping,\"Huan Fu (The University of Sydney)*; Mingming Gong (University of Pittsburgh); Chaohui Wang (Laboratoire d'Informatique Gaspard Monge, Université Paris-Est); Kayhan Batmanghelich (University of Pittsburgh / Carnegie Mellon University); Kun Zhang (Carnegie Mellon University); Dacheng Tao (University of Sydney)\",Image and Video Synthesis,Deep Learning,Oral,1.2,129,J,,,\n3521,DeepVoxels: Learning Persistent 3D Feature Embeddings,Vincent Sitzmann (Stanford University)*; Justus Thies (Technical University of Munich); Felix Heide (Princeton University); Matthias Niessner (Technical University of Munich); Gordon Wetzstein (Stanford University); Michael Zollhoefer (Stanford University),Image and Video Synthesis,Deep Learning,Oral,1.2,130,J,,,\n5944,Inverse Path Tracing for Joint Material and Lighting Estimation,Dejan Azinovic (Technical University of Munich)*; Tzu-Mao Li (MIT CSAIL); Matthias Niessner (Technical University of Munich); Anton Kaplanyan (Facebook Reality Labs),Image and Video Synthesis,Computational Photography; Computer Vision Theory; Optimization Methods; Scene Analysis and Understa,Oral,1.2,131,J,,,\n4057,The Visual Centrifuge: Model-Free Layered Video 
Representations,Jean-Baptiste Alayrac (DeepMind); Joao Carreira (DeepMind)*; Andrew Zisserman (University of Oxford),Image and Video Synthesis,Computational Photography; Deep Learning ; Representation Learning; Scene Analysis and Understanding,Oral,1.2,132,J,,,\n5720,Label-Noise Robust Generative Adversarial Networks,Takuhiro Kaneko (The University of Tokyo)*; Yoshitaka Ushiku (The University of Tokyo); Tatsuya Harada (The University of Tokyo),Image and Video Synthesis,Deep Learning ; Representation Learning,Oral,1.2,133,J,,,\n5766,DLOW: Domain Flow for Adaptation and Generalization,Wen Li (ETH Zurich)*; RUI GONG (ETH Zurich); Yuhua Chen (ETH Zurich); Luc Van Gool (ETH Zurich),Image and Video Synthesis,Deep Learning ; Scene Analysis and Understanding,Oral,1.2,134,J,,,\n6970,CollaGAN: Collaborative GAN for Missing Image Data Imputation,\"Dongwook Lee (Korea Advanced Institute of Science and Technology)*; Junyoung Kim (Korea Advanced Institute of Science and Technology); Won-Jin Moon (Konkuk University Medical Center); Jong Chul Ye (\"\"Department of Bio and Brain Engineering, KAIST, Korea\"\")\",Image and Video Synthesis,\"Deep Learning ; Face, Gesture, and Body Pose ; Medical, Biological and Cell Microscopy\",Oral,1.2,135,J,,,\n194,Spatial Fusion GAN for Image Synthesis,\"Fangneng Zhan (Nanyang Technological University); Hongyuan Zhu (Institute for Infocomm, Research Agency for Science, Technology and Research (A*STAR) Singapore); Shijian Lu (Nanyang Technological University)*\",Image and Video Synthesis,Deep Learning ; Document Analysis,Poster,1.2,136,J,,,\n1347,Text Guided Person Image Synthesis,Xingran Zhou (Zhejiang University); Siyu Huang (Zhejiang University)*; Bin Li (Zhejiang University); Yingming Li (Zhejiang University); Jiachen Li (Nanjing University); Zhongfei Zhang (Zhejiang University),Image and Video Synthesis,\"Face, Gesture, and Body Pose ; Vision + Language\",Poster,1.2,137,J,,,\n1439,STGAN: A Unified Selective Transfer Network for Arbitrary Image 
Attribute Editing,\"Ming Liu (Harbin Institute of Technology, China); Yukang Ding (Baidu Research); Min Xia (Harbin Institute of Technology, China); Xiao Liu (Baidu); Errui Ding (Baidu Inc.); Wangmeng Zuo (Harbin Institute of Technology, China)*; Shilei Wen (Baidu Research)\",Image and Video Synthesis,\"Face, Gesture, and Body Pose\",Poster,1.2,138,J,,,\n1453,Towards Instance-level Image-to-Image Translation,Zhiqiang Shen (UIUC)*; Jianping Shi (Sensetime Group Limited); Mingyang Huang (Sensetime Group Limited); Xiangyang Xue (Fudan University); Thomas Huang (UIUC),Image and Video Synthesis,Deep Learning,Poster,1.2,139,J,,,\n1637,Dense Intrinsic Appearance Flow for Human Pose Transfer,Yining Li (Chinese University of Hong Kong)*; Chen Huang (Carnegie Mellon University); Chen Change Loy (Nanyang Technological University),Image and Video Synthesis,\"Face, Gesture, and Body Pose\",Poster,1.2,140,J,,,\n1769,Depth-Aware Video Frame Interpolation,\"Wenbo Bao (Shanghai Jiao Tong University)*; Wei-Sheng Lai (University of California, Merced); Chao Ma (Shanghai Jiao Tong University); Xiaoyun Zhang (Shanghai Jiao Tong University); Zhiyong Gao (Shanghai Jiao Tong University); Ming-Hsuan Yang (University of California at Merced)\",Image and Video Synthesis,Low-level Vision; Motion and Tracking; Scene Analysis and Understanding,Poster,1.2,141,J,,,\n1891,Sliced Wasserstein Generative Models,Jiqing Wu (ETH Zurich); Zhiwu Huang (ETH Zurich); Dinesh Acharya (ETH Zurich)*; Wen Li (ETH Zurich); Janine D Thoma (ETH Zurich); Danda Pani Paudel (ETH Zürich); Luc Van Gool (ETH Zurich),Image and Video Synthesis,Deep Learning ; Optimization Methods,Poster,1.2,142,J,,,\n1892,Deep Flow Guided Video Inpainting,Rui Xu (CUHK)*; Xiaoxiao Li (The Chinese University of Hong Kong); Bolei Zhou (CUHK); Chen Change Loy (Nanyang Technological University),Image and Video Synthesis,Low-level Vision,Poster,1.2,143,J,,,\n1959,Video Generation from Single Semantic Label Map,\"Junting Pan (Sensetime); Chengyu 
Wang (SenseTime Research); Xu Jia (Huawei Noah's Ark Lab); Jing Shao (Sensetime); Lu Sheng (The Chinese University of Hong Kong)*; Junjie Yan (Sensetime Group Limited); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",Image and Video Synthesis,Deep Learning ; Motion and Tracking; Scene Analysis and Understanding,Poster,1.2,144,J,,,\n50,Polarimetric Camera Calibration Using an LCD Monitor,Zhixiang Wang (National Taiwan University); Yinqiang Zheng (National Institute of Informatics)*; Yung-Yu Chuang (National Taiwan University),Computational Photography,3D from Multiview and Sensors; Motion and Tracking; Physics-based Vision and Shape-from-X,Poster,1.2,145,K,,,\n1399,Fully Automatic Video Colorization with Self Regularization and Diversity,Qifeng Chen (HKUST)*; Chenyang Lei (HKUST),Vision + Graphics,Computational Photography; Image and Video Synthesis; Low-level Vision,Poster,1.2,146,K,,,\n1597,\"Zoom to Learn, Learn to Zoom\",Xuaner Zhang (UC Berkeley)*; Qifeng Chen (HKUST); Ren Ng (UC Berkeley); Vladlen Koltun (Intel Labs),Computational Photography,Deep Learning ; Low-level Vision,Poster,1.2,147,K,,,\n1787,Single Image Reflection Removal Beyond Linearity,Qiang Wen (South China University of Technology); Yinjie Tan (South China University of Technology); Jing Qin (The Hong Kong Polytechnic University); Wenxi Liu (Fuzhou University); Guoqiang Han (South China University of Technology); Shengfeng He (South China University of Technology)*,Vision + Graphics,Low-level Vision,Poster,1.2,148,K,,,\n1931,Learning to Separate Multiple Illuminants in a Single Image,Zhuo Hui (Carnegie Mellon University)*; Ayan Chakrabarti (Washington University in St. 
Louis); Kalyan Sunkavalli (Adobe Research); Aswin Sankaranarayanan (Carnegie Mellon University),Computational Photography,Deep Learning ; Low-level Vision,Poster,1.2,149,K,,,\n2050,Shape Unicode: A Unified Shape Representation,Sanjeev Muralikrishnan (Adobe)*; Vladimir Kim (Adobe); Matthew Fisher (Adobe Research); Siddhartha Chaudhuri (Adobe Research),Vision + Graphics,Representation Learning,Poster,1.2,150,K,,,\n2119,Robust Video Stabilization by Optimization in CNN Weight Space,Jiyang Yu (University of California San Diego)*; Ravi Ramamoorthi (University of California San Diego),Computational Photography,Image and Video Synthesis,Poster,1.2,151,K,,,\n99,Learning Linear Transformations for Fast Image and Video Style Transfer,\"Xueting Li (University of California, Merced)*; Sifei Liu (NVIDIA); Jan Kautz (NVIDIA); Ming-Hsuan Yang (University of California at Merced)\",Low-level Vision,Deep Learning,Poster,1.2,152,L,,,\n1370,Local detection of stereo occlusion boundaries,Jialiang Wang (Harvard University)*; Todd Zickler (Harvard University),Low-level Vision,\"3D from Multiview and Sensors; Physics-based Vision and Shape-from-X; Segmentation, Grouping and Sha\",Poster,1.2,153,L,,,\n1532,Bi-Directional Cascade Network for Perceptual Edge Detection,\"Jianzhong He (Peking University); Shiliang Zhang (Beijing University)*; Ming Yang (Horizon Robotics); Yanu Shan (Beijing Horizon information Technology Co.,ltd); Tiejun Huang (Peking University)\",Low-level Vision,\"Scene Analysis and Understanding; Segmentation, Grouping and Shape; Visual Reasoning\",Poster,1.2,154,L,,,\n1554,Single Image Deraining: A Comprehensive Benchmark Analysis,\"siyuan li (Tianjin university)*; Iago Breno A. do C. Araujo (USP); Wenqi Ren (Institute of Information Engineering, Chinese Academy of Sciences); Zhangyang Wang (TAMU); Eric K. 
Tokuda (Usp); Roberto Hirata Junior (USP); Roberto Cesar-Junior (Usp); Jiawan Zhang (Tianjin University); Xiaojie Guo (Tianjin University); Xiaochun Cao (Chinese Academy of Sciences)\",Low-level Vision,Datasets and Evaluation; Deep Learning,Poster,1.2,155,L,,,\n1581,Dynamic Scene Deblurring with Parameter Selective Sharing and Nested Skip Connections,Hongyun Gao (The Chinese University of Hong Kong)*; Xin Tao (Tencent); Xiaoyong Shen (Tencent); Jiaya Jia (Chinese University of Hong Kong),Low-level Vision,Deep Learning,Poster,1.2,156,L,,,\n1595,Events-to-Video: Bringing Modern Computer Vision to Event Cameras,\"Henri Rebecq (University of Zurich & ETH Zurich)*; Rene Ranftl (Intel Labs); Vladlen Koltun (Intel Labs); Davide Scaramuzza (University of Zurich & ETH Zurich, Switzerland)\",Low-level Vision,Image and Video Synthesis; Motion and Tracking,Poster,1.2,157,L,,,\n1648,Feedback Network for Image Super-Resolution,\"Zhen Li (Sichuan University); Jinglei Yang (University of California, Santa Barbara); Zheng Liu (University of British Columbia); Xiaomin Yang (Sichuan University)*; Gwanggil Jeon (Incheon National University ); Wei Wu (Sichuan University)\",Low-level Vision,Deep Learning,Poster,1.2,158,L,,,\n1671,Semi-supervised Transfer Learning for Image Rain Removal,Wei Wei (Northwestern University)*; Deyu Meng (Xi'an Jiaotong University); Qian Zhao (Xi'an Jiaotong University ); Zongben Xu (Xi'an Jiaotong University); Ying Wu (Northwestern University),Low-level Vision,Statistical Learning; Vision + Graphics,Poster,1.2,159,L,,,\n1710,EventNet: Asynchronous recursive event processing,Yusuke Sekikawa (DENSO IT Laboratory)*; Kosuke Hara (DENSO IT Laboratory); Hideo Saito (Keio University),Low-level Vision,Computer Vision Theory; Deep Learning ; Motion and Tracking; Video Analytics,Poster,1.2,160,L,,,\n1927,Recurrent Back-Projection Network for Video Super-resolution,Muhammad Haris (TTI-J)*; Greg Shakhnarovich (TTI-Chicago); Norimichi Ukita (TTI-J),Low-level Vision,Deep 
Learning,Poster,1.2,161,L,,,\n1968,Cascaded Partial Decoder for Fast and Accurate Salient Object Detection,Zhe Wu (University of Chinese Academy of Sciences)*; Li Su (University of Chinese Academy of Sciences); Qingming Huang (University of Chinese Academy of Sciences),Low-level Vision,\"Segmentation, Grouping and Shape\",Poster,1.2,162,L,,,\n2031,A Simple Pooling-Based Design for Real-Time Salient Object Detection,Jiang-Jiang Liu (Nankai University)*; Qibin Hou (Nankai University); Ming-Ming Cheng (Nankai University); Jiashi Feng (NUS); Jianmin Jiang (Shenzhen University),Low-level Vision,Deep Learning,Poster,1.2,163,L,,,\n2128,Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection,\"Jiaxing Zhao (Nankai University); Yang Cao (Nankai University); Deng-Ping Fan (Nankai University); Xuan-Yi Li (Nankai University); Le Zhang (Institute for Infocomm Research，Agency for Science, Technology and Research (A*STAR)); Ming-Ming Cheng (Nankai University)*\",Low-level Vision,RGBD sensors and analytics,Poster,1.2,164,L,,,\n2140,Progressive Image Deraining Networks: Simpler and Better,\"Dongwei Ren (Tianjin University)*; Wangmeng Zuo (Harbin Institute of Technology, China); Qinghua Hu (Tianjin University); Pengfei Zhu (tianjin university); Deyu Meng (Xi'an Jiaotong University)\",Low-level Vision,Computational Photography; Deep Learning,Poster,1.2,165,L,,,\n6592,d-SNE: Domain Adaptation using Stochastic Neighborhood Embedding,Xiang Xu (University of Houston); Xiong Zhou (amazon); Ragav Venkatesan (Amazon)*; Orchid Majumder (Amazon); Guru Swaminathan (Amazon),\"Recognition: Detection, Categorization, Retrieval\",Representation Learning,Oral,1.2,166,M,,,\n197,Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation,\"Yawei Luo (University of Technology Sydney)*; Liang Zheng (Australian National University); Tao Guan (Huazhong University of Science and Technology); Junqing Yu (Huazhong University of Science & 
Technology); Yi Yang (University of Technology, Sydney)\",\"Segmentation, Grouping and Shape\",Deep Learning,Oral,1.2,167,M,,,\n396,ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation,Tuan-Hung VU (Valeo.ai)*; Himalaya Jain (Valeo.ai); Maxime Bucher (Valeo.ai); Matthieu Cord (Sorbonne University); Patrick Pérez (Valeo.ai),Scene Analysis and Understanding,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape\",Oral,1.2,168,M,,,\n325,Local Feature Augmentation with Cross-Modality Context,Zixin Luo (HKUST)*; Tianwei Shen (HKUST); Lei Zhou (HKUST); Jiahui Zhang (Tsinghua University); Yao Yao (The Hong Kong University of Science and Technology); Shiwei Li (HKUST); Tian Fang (HKUST); Long Quan (Hong Kong University of Science and Technology),Representation Learning,Low-level Vision,Oral,1.2,169,M,,,\n556,Large-scale Long-Tailed Recognition in an Open World,Ziwei Liu (The Chinese University of Hong Kong)*; Zhongqi Miao (UC Berkeley); Xiaohang Zhan (The Chinese University of Hong Kong); Jiayun Wang (UC Berkeley / ICSI); Boqing Gong (Tencent AI Lab); Stella X Yu (UC Berkeley / ICSI),Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Oral,1.2,170,M,,,\n5137,AET vs. 
AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data,Liheng Zhang (University of Central Florida); Guo-Jun Qi (Huawei Cloud)*; Liqiang Wang (University of Central Florida); Jiebo Luo (University of Rochester),Representation Learning,Deep Learning,Oral,1.2,171,M,,,\n576,SDC - Stacked Dilated Convolution: A Unified Descriptor Network for Dense Matching Tasks,René Schuster (DFKI)*; Oliver Wasenmüller (DFKI); Christian Unger (BMW); Didier Stricker (DFKI),Representation Learning,Deep Learning ; Motion and Tracking; Robotics + Driving; Scene Analysis and Understanding; Vision Ap,Oral,1.2,172,M,,,\n2746,Learning Correspondence from the Cycle-consistency of Time,Xiaolong Wang (CMU)*; Allan Jabri (UC Berkeley); Alexei A Efros (UC Berkeley),Representation Learning,Motion and Tracking; Video Analytics,Oral,1.2,173,M,,,\n2131,AE^2-Nets: Autoencoder in Autoencoder Networks,Changqing Zhang (Tianjin university)*; liu yeqing (Tianjin University ); Huazhu Fu (Inception Institute of Artificial Intelligence),Representation Learning,Statistical Learning,Oral,1.2,174,M,,,\n1655,Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach,Proteek Roy (Michigan State University); Vishnu Boddeti (Michigan State University)*,Representation Learning,Optimization Methods; Statistical Learning,Oral,1.2,175,M,,,\n3877,Learning Spatial Common Sense with Geometry-Aware Recurrent Networks,Hsiao-Yu Tung (Carnegie Mellon University)*; Ricson Cheng (Carnegie Mellon University); Katerina Fragkiadaki (Carnegie Mellon University),Representation Learning,\"3D from Single Image; Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understa\",Oral,1.2,176,M,,,\n3147,Structured Knowledge Distillation for Semantic Segmentation,\"Yifan Liu (University of Adelaide); Ke Chen (Microsoft); Chris Liu (Microsoft); Zengchang Qin (Intelligent Computing & Machine Learning Lab, School of ASEE, Beihang University); Zhenbo Luo ( Samsung 
Research Institute China-Beijing); Jingdong Wang (Microsoft Research)*\",\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Oral,1.2,177,M,,,\n977,Scan2CAD: Learning CAD Model Alignment in RGB-D Scans,Armen Avetisyan (Technical University of Munich)*; Manuel Dahnert (Technical University of Munich); Angela Dai (Technical University of Munich); Manolis Savva (Simon Fraser University); Angel X Chang (Eloquent Labs); Matthias Niessner (Technical University of Munich),Scene Analysis and Understanding,\"Recognition: Detection, Categorization, Retrieval; Vision + Graphics\",Oral,1.2,178,M,,,\n2799,Towards Scene Understanding: Unsupervised Monocular Depth Estimation with Semantic-aware Representation,Po-Yi Chen (National Taiwan University); Alexander H. Liu (National Taiwan University); Yen-Cheng Liu (Georgia Institute of Technology); Yu-Chiang Frank Wang (National Taiwan University)*,Scene Analysis and Understanding,\"3D from Single Image; Representation Learning; Robotics + Driving; Segmentation, Grouping and Shape\",Oral,1.2,179,M,,,\n3107,Tell Me Where I Am: Object-level Scene Context Prediction,Xiaotian Qiao (City University of Hong Kong); Quanlong Zheng (City University of Hong Kong); Ying Cao (City University of Hong Kong)*; Rynson W.H. 
Lau (City University of Hong Kong),Scene Analysis and Understanding,Image and Video Synthesis,Oral,1.2,180,M,,,\n1373,Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation,He Wang (Stanford University); Srinath Sridhar (Stanford University)*; Jingwei Huang (Stanford University); Julien Valentin (Google); Shuran Song (Princeton); Leonidas Guibas (Stanford University),Scene Analysis and Understanding,3D from Single Image; Datasets and Evaluation; Deep Learning ; RGBD sensors and analytics; Vision Ap,Oral,1.2,181,M,,,\n2452,Supervised Fitting of Geometric Primitives to 3D Point Clouds,Lingxiao Li (Stanford University)*; Minhyuk Sung (Stanford University); Anastasia Dubrovina (Stanford); Li Yi (Stanford); Leonidas Guibas (Stanford University),\"Segmentation, Grouping and Shape\",Deep Learning ; Vision + Graphics,Oral,1.2,182,M,,,\n4225,Do Better ImageNet Models Transfer Better?,Simon Kornblith (Google)*; Jon Shlens (Google); Quoc Le (Google Brain),Representation Learning,Deep Learning,Oral,1.2,183,M,,,\n1371,GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud,Li Yi (Stanford)*; Wang Zhao (Tsinghua University); He Wang (Stanford University); Minhyuk Sung (Stanford University); Leonidas Guibas (Stanford University),Scene Analysis and Understanding,\"Segmentation, Grouping and Shape\",Poster,1.2,184,M,,,\n1423,Attentive Relational Networks for Mapping Images to Scene Graphs,\"Mengshi Qi (Beihang University)*; Weijian Li (University of Rochester); Zhengyuan Yang (University of Rochester); Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China); Jiebo Luo (U. 
Rochester)\",Scene Analysis and Understanding,Visual Reasoning,Poster,1.2,185,M,,,\n1500,Relational Knowledge Distillation,Wonpyo Park (POSTECH)*; Minsu Cho (POSTECH); Yan Lu (Microsoft Research Asia); Dongju Kim (POSTECH),Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,1.2,186,M,,,\n1557,Compressing Convolutional Neural Networks via Factorized Convolutional Filters,Tuanhui Li (Tsinghua University); Baoyuan Wu (Tencent AI Lab)*; Yujiu Yang (Tsinghua Univ.); Yanbo Fan (Tencent AI Lab); Yong Zhang (Tencent AI Lab); Wei Liu (Tencent),Representation Learning,Deep Learning,Poster,1.2,187,M,,,\n1586,DeepMDS: Non-Linear Projection of Deep Representations,Gong Sixue (Michigan State University); Vishnu Boddeti (Michigan State University)*; Anil Jain (Michigan State University),Representation Learning,\"Deep Learning ; Face, Gesture, and Body Pose ; Statistical Learning\",Poster,1.2,188,M,,,\n1593,Part-regularized Near-Duplicate Vehicle Re-identification,Bing He (Beihang University); Jia Li (Beihang University)*; Yifan Zhao (Beihang University); Yonghong Tian (PKU),Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,1.2,189,M,,,\n1785,Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics,Jiangliu WANG (CUHK); Jianbo Jiao (University of Oxford); Linchao Bao (Tencent AI Lab)*; Shengfeng He (South China University of Technology); Yunhui Liu (CUHK); Wei Liu (Tencent),Representation Learning,Video Analytics,Poster,1.2,190,M,,,\n1789,Classification-Reconstruction Learning for Open-Set Recognition,Ryota Yoshihashi (The University of Tokyo)*; Wen Shao (The University of Tokyo); Rei Kawakami (The University of Tokyo); Shaodi You (Data61-CSIRO); Makoto Iida (The University of Tokyo); Takeshi Naemura (The University of Tokyo),Representation Learning,Deep Learning,Poster,1.2,191,M,,,\n1867,Emotion-Aware Human Attention 
Prediction,\"Macario II O Cordel (De La Salle University)*; Shaojing Fan (National University of Singapore); Zhiqi Shen (National University of Singapore); Mohan Kankanhalli (National University of Singapore,)\",Scene Analysis and Understanding,Deep Learning ; Others; Representation Learning; Vision Applications and Systems,Poster,1.2,192,M,,,\n1875,Residual Regression with Semantic Prior for Crowd Counting,\"Jia Wan (City University of Hong Kong)*; Wenhan Luo (Tencent AI Lab); Baoyuan Wu (Tencent AI Lab); Antoni Chan (City University of Hong Kong, Hong, Kong); Wei Liu (Tencent)\",Scene Analysis and Understanding,Vision Applications and Systems,Poster,1.2,193,M,,,\n1881,Context-Reinforced Semantic Segmentation,Yizhou Zhou (University of Science and Technology of China)*; Xiaoyan Sun (Microsoft Research Asia); Zheng-Jun Zha (University of Science and Technology of China); Wenjun Zeng (Microsoft Research),Scene Analysis and Understanding,\"Segmentation, Grouping and Shape\",Poster,1.2,194,M,,,\n1929,Adversarial Structure Matching for Structured Prediction Tasks,Jyh-Jing Hwang (International Computer Science Institute)*; Tsung-Wei Ke (International Computer Science Institute); Jianbo Shi (University of Pennsylvania); Stella X Yu (UC Berkeley / ICSI),Scene Analysis and Understanding,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,1.2,195,M,,,\n1981,Deep Spectral Clustering using Dual Autoencoder Network,Xu Yang (Xidian University); Cheng Deng (Xidian University)*; Feng Zheng (Southern University of Science and Technology); Junchi Yan (Shanghai Jiao Tong University); Wei Liu (Tencent),Representation Learning,Statistical Learning,Poster,1.2,196,M,,,\n1990,Deep Asymmetric Metric Learning via Rich Relationship Mining,Xinyi Xu (Xidian University); Yanhua Yang (Xidian University); Cheng Deng (Xidian University)*; Feng Zheng (Southern University of Science and Technology),Representation Learning,\"Big Data, Large Scale 
Methods\",Poster,1.2,197,M,,,\n2011,Did it change? Learning to Detect Point-of-Interest Changes for Proactive Map Updates,Jerome Revaud (Naver Labs Europe)*; Minhyeok Heo (Naver LABS); Rafael S Rezende (Naver Labs); Chanmi You (Naver Labs); Seong-Gyun Jeong (Naver Labs),Representation Learning,\"Datasets and Evaluation; Deep Learning ; Recognition: Detection, Categorization, Retrieval; Scene An\",Poster,1.2,198,M,,,\n2017,Associatively Segmenting Instances and Semantics in Point Clouds,Xinlong Wang (Tongji University)*; Shu Liu (Tencent); Xiaoyong Shen (Tencent); Chunhua Shen (University of Adelaide); Jiaya Jia (Chinese University of Hong Kong),Scene Analysis and Understanding,\"3D from Multiview and Sensors; Deep Learning ; Segmentation, Grouping and Shape\",Poster,1.2,199,M,,,\n2027,\"Pattern-Affinitive Propagation across Depth, Surface Normal and Semantic Segmentation\",Zhenyu Zhang (Nanjing University of Sci & Tech)*; Zhen Cui (Nanjing University of Science and Technology); Chunyan Xu (Nanjing University of Science and Technology); Yan Yan (Nanjing University of Science and Technology); Nicu Sebe (University of Trento); Jian Yang (Nanjing University of Science and Technology),Scene Analysis and Understanding,\"3D from Single Image; Segmentation, Grouping and Shape\",Poster,1.2,200,M,,,\n2048,Scene Categorization from Contours: Medial Axis Based Salience Measures,Morteza Rezanejad (Mcgill university )*; Gabriel Downs (McGill University); John Wilder (University of Toronto); Dirk Bernhardt-Walther (University of Toronto); Sven Dickinson (University of Toronto); Allan Jepson (Samsung); Kaleem Siddiqi (McGill University),Scene Analysis and Understanding,\"Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape\",Poster,1.2,201,M,,,\n473,Unsupervised Image Captioning,Yang Feng (University of Rochester)*; Lin Ma (Tencent AI Lab); Wei Liu (Tencent); Jiebo Luo (U. 
Rochester),Vision + Language,,Poster,1.2,202,N,,,\n1560,Adversarial Attack to Image Captioning via Structured Output Learning with Latent Variables,Yan Xu (UESTC); Baoyuan Wu (Tencent AI Lab)*; Fumin Shen (UESTC); Yanbo Fan (Tencent AI Lab); Yong Zhang (Tencent AI Lab); Heng Tao Shen (University of Electronic Science and Technology of China (UESTC)); Wei Liu (Tencent),Vision + Language,,Poster,1.2,203,N,,,\n1735,Cross-Modal Relationship Inference for Grounding Referring Expressions,Sibei Yang (The University of Hong Kong); Guanbin Li (Sun Yat-sen University)*; Yizhou Yu (Deepwise AI Lab),Vision + Language,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,204,N,,,\n1771,What's to know? Uncertainty as a Guide to Asking Goal-oriented Questions,Ehsan M Abbasnejad (Adelaide)*; Qi Wu (University of Adelaide); Qinfeng Shi (University of Adelaide); Anton van den Hengel (University of Adelaide),Vision + Language,Deep Learning,Poster,1.2,205,N,,,\n1777,Iterative Alignment Network for Continuous Sign Language Recognition,Junfu Pu (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China),Vision + Language,\"Face, Gesture, and Body Pose\",Poster,1.2,206,N,,,\n1903,Neural Sequential Phrase Grounding (SeqGROUND),Pelin Dogan (ETH Zurich)*; Leonid Sigal (University of British Columbia); Markus Gross (ETH Zurich),Vision + Language,\"Recognition: Detection, Categorization, Retrieval; Visual Reasoning\",Poster,1.2,207,N,,,\n1938,CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions,Runtao Liu (Peking University); Chenxi Liu (Johns Hopkins University)*; Yutong Bai (Northwestern Polytechnical University); Alan Yuille (Johns Hopkins University),Vision + Language,Datasets and Evaluation; Deep Learning ; Visual Reasoning,Poster,1.2,208,N,,,\n1964,Describing like Humans: on Diversity in Image Captioning,\"Qingzhong Wang (Department of Computer 
Science, City University of Hong Kong)*; Antoni Chan (City University of Hong Kong, Hong, Kong)\",Vision + Language,,Poster,1.2,209,N,,,\n2078,MSCap: Multi-Style Image Captioning with Unpaired Stylized Text,\"Longteng Guo ( Institute of Automation, Chinese Academy of Sciences)*; Jing Liu (National Lab of Pattern Recognition, Institute of Automation,Chinese Academy of Sciences); Peng Yao (University of Science and Technology Beijing); Jiangwei Li (Huawei); Hanqing Lu (NLPR, Institute of Automation, CAS)\",Vision + Language,,Poster,1.2,210,N,,,\n1381,Towards Accurate Task Accomplishment with Low-Cost Robotic Arms,Yiming Zuo (Tsinghua University)*; Weichao Qiu (Johns Hopkins University); Lingxi Xie (Johns Hopkins University); Fangwei Zhong (Peking University); Yizhou Wang (PKU); Alan Yuille (Johns Hopkins University),Vision Applications and Systems,3D from Single Image; Robotics + Driving,Poster,1.2,211,O,,,\n1461,Networks for Joint Affine and Non-parametric Image Registration,Zhengyang Shen (UNC)*; Xu Han (UNC Chapel Hill); Zhenlin Xu (UNC Chapel Hill); Marc Niethammer (UNC),\"Medical, Biological and Cell Microscopy\",Deep Learning ; Optimization Methods,Poster,1.2,212,O,,,\n1502,Learning Shape-Aware Embedding for Scene Text Detection,Zhuotao Tian (Chinese University of Hong Kong)*; Michelle Shu (the Johns Hopkins University); Pengyuan Lyu (Tencent); Ruiyu Li (Tencent); Chao Zhou (Tencent); Xiaoyong Shen (Tencent); Jiaya Jia (Chinese University of Hong Kong),Vision Applications and Systems,\"Recognition: Detection, Categorization, Retrieval\",Poster,1.2,213,O,,,\n1687,Learning to Film from Professional Human Motion Videos,Chong Huang (UC Santa Barbara); Chuan-en Lin (Hong Kong University of Science and Technology); Zhenyu Yang (UC Santa Barbara); Yan Kong (UC Santa Barbara); Peng Chen (Zhejiang University of Technology); Xin Yang (Huazhong University of Science and Technology)*; Kwang-Ting Cheng (Hong Kong University of Science and Technology),Vision Applications and 
Systems,Deep Learning ; Robotics + Driving; Scene Analysis and Understanding,Poster,1.2,214,O,,,\n1920,Pay attention! - Robustifying a Deep Visuomotor Policy through Task-Focused Visual Attention,Pooya Abolghasemi (University of Central Florida)*; Amir Mazaheri (University of Central Florida); Ladislau Boloni (University of Central Florida); Mubarak Shah (University of Central Florida),Robotics + Driving,Vision Applications and Systems,Poster,1.2,215,O,,,\n2089,Deep Blind Video Decaptioning by Temporal Aggregation and Recurrence,Dahun Kim (KAIST)*; Sanghyun Woo (KAIST); Joon-Young Lee (Adobe Research); In So Kweon (KAIST),Vision Applications and Systems,Low-level Vision; Video Analytics,Poster,1.2,216,O,,,\n1213,Learning Video Representations from Correspondence Proposals,Xingyu Liu (Stanford University)*; Joon-Young Lee (Adobe Research); Hailin Jin (Adobe Research),Deep Learning,Action Recognition ; Representation Learning,Oral,2.1,1,A,,,\n1503,SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks,\"Bo Li (SenseTime Group Limited)*; Wei Wu (SenseTime Group Limited); Junjie Yan (Sensetime Group Limited); Qiang Wang (University of Chinese Academy of Sciences); Fangyi Zhang (Institute of Computing Technology); Junliang Xing (Institute of Automation, Chinese Academy of Sciences)\",Deep Learning,Motion and Tracking,Oral,2.1,2,A,,,\n2556,Sphere Generative Adversarial Network Based on Geometric Moment Matching,\"Sung Woo Park (Chung-Ang Univ., Korea); Junseok Kwon (Chung-Ang Univ., Korea)*\",Deep Learning,Image and Video Synthesis,Oral,2.1,3,A,,,\n1431,Adversarial Attacks Beyond the Image Space,xiaohui zeng (toronto); Chenxi Liu (Johns Hopkins University)*; Yu-Siang Wang (National Taiwan University); Weichao Qiu (Johns Hopkins University); Lingxi Xie (Johns Hopkins University); Yu-Wing Tai (Tencent); Chi-Keung Tang (Hong Kong University of Science and Technology); Alan Yuille (Johns Hopkins University),Deep Learning,Vision + 
Graphics,Oral,2.1,4,A,,,\n5297,Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks,Yinpeng Dong (Tsinghua University)*; Tianyu Pang (Tsinghua University); Hang Su (Tsinghua University); Jun Zhu (Tsinghua University),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Oral,2.1,5,A,,,\n6129,Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses,Jérôme Rony (ÉTS Montréal)*; Luiz Gustavo Hafemann (ÉTS Montréal); Luis Eduardo Oliveira (UFPR); Ismail Ben Ayed (ETS Montreal); Robert Sabourin (Canada); Eric Granger (ETS Montreal ),Deep Learning,Others; Vision Applications and Systems,Oral,2.1,6,A,,,\n1472,A General and Adaptive Robust Loss Function,Jonathan T Barron (Google Research)*,Deep Learning,3D from Single Image; Computer Vision Theory; Image and Video Synthesis; Low-level Vision; Statistic,Oral,2.1,7,A,,,\n2677,Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration,\"Yang He (University of Technology Sydney)*; Ping Liu (UTS); Ziwei Wang (Information Science Academy, CETC); Zhilan Hu (Huawei); Yi Yang (University of Technology, Sydney)\",Deep Learning,Others,Oral,2.1,8,A,,,\n4595,Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss,Sangil Jung (Samsung)*; Changyong Son (Samsung); Seohyung Lee (Samsung); Jinwoo Son (Samsung); Jae-Joon Han (Samsung); Youngjun Kwak (Samsung); Sung Ju Hwang (KAIST); Changkyu Choi (Samsung),Deep Learning,Optimization Methods; Vision Applications and Systems,Oral,2.1,9,A,,,\n1773,Not All Areas Are Equal: Transfer Learning for Semantic Segmentation via Hierarchical Region Selection,Ruoqi Sun (Shanghai Jiao Tong University)*; Xinge Zhu (The Chinese University of Hong Kong); Chongruo Wu (UC Davis); Chen Huang (Carnegie Mellon University); Jianping Shi (Sensetime Group Limited); Lizhuang Ma (Shanghai Jiao Tong University),Deep Learning,\"Low-level Vision; Segmentation, Grouping and 
Shape\",Oral,2.1,10,A,,,\n3740,Unsupervised Learning of Dense Shape Correspondence,\"Oshri Halimi (Technion)*; Or Litany (Facebook AI Research); Emanuele Rodola (Sapienza University of Rome); Alex Bronstein (Tel Aviv University, Israel); Ron Kimmel (Technion)\",Deep Learning,\"Computer Vision Theory; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping an\",Oral,2.1,11,A,,,\n6188,RePr: Improved Training of Convolutional Filters,Aaditya Prakash (Brandeis University)*; James Storer (Brandeis University); Dinei Florencio (Microsoft Research); Cha Zhang (Microsoft Research),Deep Learning,,Oral,2.1,12,A,,,\n2726,Balanced Self-Paced Learning for Generative Adversarial Clustering Network,Kamran Ghasedi (University of Pittsburgh)*; Xiaoqian Wang (University of Pittsburgh); Cheng Deng (Xidian University); Heng Huang (University of Pittsburgh),Deep Learning,,Oral,2.1,13,A,,,\n2860,A Style-Based Generator Architecture for Generative Adversarial Networks,Tero Karras (NVIDIA Research)*; Samuli Laine (NVIDIA Research); Timo Aila (NVIDIA Research),Deep Learning,Image and Video Synthesis; Representation Learning,Oral,2.1,14,A,,,\n5426,Parallel Optimal Transport GAN,Gil Avraham (Monash University)*; Yan Zuo (Monash University); Tom Drummond (Monash University),Deep Learning,Optimization Methods; Representation Learning,Oral,2.1,15,A,,,\n2292,Reversible GANs for Memory-efficient Image-to-Image Translation,Tycho van der Ouderaa (University of Amsterdam)*; Daniel E Worrall (University of Amsterdam),Deep Learning,\"Image and Video Synthesis; Others; Segmentation, Grouping and Shape\",Poster,2.1,16,A,,,\n2306,Sensitive-Sample Fingerprinting of Deep Neural Networks,Zecheng He (Princeton University)*; Tianwei Zhang (Princeton University); Ruby Lee (Princeton University),Deep Learning,Others,Poster,2.1,17,A,,,\n2320,Soft Labels for Ordinal Regression,Raul Diaz (HP Inc)*; Amit Marathe (HP Inc),Deep Learning,\"3D from Single Image; Recognition: Detection, Categorization, 
Retrieval; Robotics + Driving; Scene A\",Poster,2.1,18,A,,,\n2377,Local to Global Learning for Deep Neural Networks,Hao Cheng (Shanghaitech University)*; Dongze Lian (Shanghaitech University); Bowen Deng (Shanghaitech University); Shenghua Gao (Shanghaitech University); Tao Tan (Eindhoven University of Technology); Yanlin Geng (Xidian University),Deep Learning,Computer Vision Theory; Representation Learning; Statistical Learning,Poster,2.1,19,A,,,\n2447,\"What does it mean to learn in deep networks? And, how does one detect adversarial attacks?\",\"Ciprian Corneanu (Universitat de Barcelona)*; Aleix M Martinez (OSU); Sergio Escalera (Computer Vision Center (UAB) & University of Barcelona,); Meysam Madadi (CVC)\",Deep Learning,Optimization Methods; Representation Learning,Poster,2.1,20,A,,,\n2459,Handwriting Recognition in Low-resource Scripts using Adversarial Learning,Ayan Kumar Bhunia (Nanyang Technological University); Abhirup Das (Institute of Engineering and Management); Ankan Kumar Bhunia (Jadavpur University); Sai Raj Kishore Perla (Institute of Engineering & Management); Partha Pratim Roy (IIT Roorkee)*,Deep Learning,Document Analysis,Poster,2.1,21,A,,,\n2466,Adversarial Defense Through Network Profiling Based Path Extraction,Yuxian Qiu (Shanghai Jiao Tong University)*; Jingwen Leng (Shanghai Jiao Tong University); Yuhao Zhu (University of Rochester); Cong Guo (Shanghai Jiao Tong University); Quan Chen (Shanghai Jiao Tong University); Chao Li (Shanghai Jiaotong University); Minyi Guo (Shanghai Jiaotong University),Deep Learning,Representation Learning; Vision Applications and Systems,Poster,2.1,22,A,,,\n2494,RENAS: Reinforced Evolutionary Neural Architecture Search,\"Yukang Chen (Institute of Automation, Chinese Academy of Sciences); Gaofeng Meng (Chinese Academy of Sciences); Qian Zhang (Horizon Robotics); SHIMING XIANG (Chinese Academy of Sciences, China); Chang Huang (Horizon Robotics); Lisen Mu (Horizon Robotics LTD); Xinggang Wang (Huazhong Univ. 
of Science and Technology)*\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.1,23,A,,,\n2537,Co-Occurrence Neural Network,Irina Shevlev (Tel Aviv University)*; Shai Avidan (Tel Aviv University),Deep Learning,,Poster,2.1,24,A,,,\n2557,SpotTune: Transfer Learning through Adaptive Fine-tuning,\"Yunhui Guo (University of California, San Diego)*; Honghui Shi (IBM | UIUC); Abhishek Kumar (Google); Kristen Grauman (Facebook AI Research & UT Austin); Tajana Rosing (University of California, San Diego); Rogerio Feris (IBM Research AI, MIT-IBM Watson AI Lab)\",Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Poster,2.1,25,A,,,\n2562,Signal-to-Noise Ratio: A Robust Distance Metric for Deep Metric Learning,Tongtong Yuan (Beijing University of Posts and Telecommunications)*; Jian Tang (Syracuse University); Binghui Chen (BUPT); Yinan Tang (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,2.1,26,A,,,\n2888,Detection based Defense against Adversarial Examples from the Steganalysis Point of View,Jiayang Liu (University of Science and Technology of China); Weiming Zhang (University of Science and Technology of China)*; Yiwei Zhang (University of Science and Technology of China); Dongdong Hou (University of Science and Technology of China); Yujia Liu (University of Science and Technology of China); Hongyue Zha (University of Science and Technology of China); Nenghai Yu (University of Science and Technology of China),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.1,27,A,,,\n2927,HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs,Pravendra Singh (Indian Institute of Technology Kanpur); Vinay Kumar Verma (Indian Institute of Technology Kanpur); Piyush Rai (IIT Kanpur); Vinay P Namboodiri (IIT Kanpur)*,Deep 
Learning,\"Computer Vision Theory; Recognition: Detection, Categorization, Retrieval\",Poster,2.1,28,A,,,\n2951,Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects,Michael A Alcorn (Auburn University); Qi Li (Auburn University); Zhitao Gong (Auburn University); Chengfei Wang (Auburn University); Long T Mai (Adobe Research); Wei-Shinn Ku (Auburn University); Anh Nguyen (Auburn University)*,Deep Learning,\"Datasets and Evaluation; Optimization Methods; Recognition: Detection, Categorization, Retrieval; Sc\",Poster,2.1,29,A,,,\n2996,Blind Geometric Distortion Correction on Images Through Deep Learning,Xiaoyu LI (Hong Kong University of Science and Technology)*; Bo Zhang (Hong Kong University of Science and Technology); Pedro Sander (HKUST); Jing Liao (City University of Hong Kong),Deep Learning,\"Computational Photography; Recognition: Detection, Categorization, Retrieval\",Poster,2.1,30,A,,,\n3013,Instance-Level Meta Normalization,Songhao Jia (National Tsing Hua University); Ding-Jie Chen (Academia Sinica); Hwann-Tzong Chen (National Tsing Hua University)*,Deep Learning,Others,Poster,2.1,31,A,,,\n3025,Iterative Normalization: Beyond Standardization towards Efficient Whitening,Lei Huang (the inception institute of artificial intelligence)*; Yi Zhou (Inception Institute of Artificial Intelligence); Fan Zhu (Inception Institute of Artificial Intelligence); Li Liu (the inception institute of artificial intelligence); Ling Shao (Inception Institute of Artificial Intelligence),Deep Learning,,Poster,2.1,32,A,,,\n3042,Density Aware Deep Metric Learning,Soumyadeep Ghosh (IIIT Delhi); Richa Singh (IIIT-Delhi); Mayank Vatsa (IIIT-Delhi)*,Deep Learning,,Poster,2.1,33,A,,,\n3083,Contrastive Adaptation Network for Unsupervised Domain Adaptation,Guoliang Kang (UTS)*; Lu Jiang (Google); Yi Yang (UTS); Alexander Hauptmann (Carnegie Mellon University),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation 
Learning\",Poster,2.1,34,A,,,\n3110,LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks,Sudhakar Kumawat (Indian Institute of Technology Gandhinagar)*; Shanmuganathan Raman (Indian Institute of Technology (IIT) Gandhinagar),Deep Learning,Scene Analysis and Understanding; Video Analytics; Vision Applications and Systems,Poster,2.1,35,A,,,\n3230,Attribute-Driven Feature Disentangling and Temporal Aggregation for Video Person Re-Identification,\"Yiru Zhao (Shanghai Jiao Tong University)*; Xu Shen (Alibaba Group); Zhongming Jin (Alibaba Group); Hongtao Lu (Shanghai Jiao Tong University); Xiansheng Hua (Damo Academy, Alibaba Group)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,2.1,36,A,,,\n3255,Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?,Shilin Zhu (UCSD)*; Xin Dong (Harvard University); Hao Su (UCSD),Deep Learning,\"Big Data, Large Scale Methods ; Recognition: Detection, Categorization, Retrieval; Statistical Learn\",Poster,2.1,37,A,,,\n3287,Distilling Object Detectors with Fine-grained Feature Imitation,\"tao wang (national university of singapore)*; Jiashi Feng (NUS); Li Yuan (National University of Singapore); Xiaopeng Zhang (Noah's Ark Lab, Huawei Inc.)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,2.1,38,A,,,\n3323,Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure,\"Xiaohan Ding (Tsinghua University)*; guiguang ding (Tsinghua University, China); Yuchen Guo (Tsinghua University); Jungong Han (Lancaster University)\",Deep Learning,,Poster,2.1,39,A,,,\n3324,Knockoff Nets: Stealing Functionality of Black-Box Models,Tribhuvanesh Orekondy (Max Planck Institute for Informatics)*; Bernt Schiele (MPI Informatics); Mario Fritz (CISPA Helmholtz Center for Information Security),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Vision 
Applications and Systems\",Poster,2.1,40,A,,,\n392,Deep Embedding Learning with Discriminative Sampling Policy,Yueqi Duan (Tsinghua University); Lei Chen (Tianjin University); Jiwen Lu (Tsinghua University)*; Jie Zhou (Tsinghua University),\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.1,41,B,,,\n1714,Hybrid Task Cascade for Instance Segmentation,Kai Chen (The Chinese University of Hong Kong)*; Jiangmiao Pang (Zhejiang University); Jiaqi Wang (CUHK); Yu Xiong (The Chinese University of HK); Xiaoxiao Li (The Chinese University of Hong Kong); Shuyang Sun (The University of Sydney); Wansen Feng (Lille university ); Ziwei Liu (The Chinese University of Hong Kong); Jianping Shi (Sensetime Group Limited); Wanli Ouyang (The University of Sydney); Chen Change Loy (Nanyang Technological University); Dahua Lin (The Chinese University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.1,42,B,,,\n2219,Multi-task Self-Supervised Object Detection via Recycling of Bounding Box Annotations,Wonhee Lee (Seoul National University)*; Joonil Na (Seoul National University); Gunhee Kim (Seoul National University),\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.1,43,B,,,\n2293,ClusterNet: Deep Hierarchical Cluster Network with Rigorously Rotation-Invariant Representation for Point Cloud Recognition,Chao Chen (Sun Yat-sen University); Guanbin Li (Sun Yat-sen University)*; Ruijia Xu (Sun Yat-sen University); Tianshui Chen (Sun Yat-Sen University); Meng Wang (Hefei University of Technology); Liang Lin (Sun Yat-sen University),\"Recognition: Detection, Categorization, Retrieval\",\"Segmentation, Grouping and Shape\",Poster,2.1,44,B,,,\n2305,Learning to Learn Relation for Important People Detection in Still Images,\"Wei-Hong Li (University of Edinburgh); Fa-Ting Hong (Sun Yat-Sen University); WEI-SHI ZHENG (Sun Yat-sen University, China)*\",\"Recognition: Detection, Categorization, Retrieval\",\"Face, Gesture, and Body Pose ; 
Representation Learning\",Poster,2.1,45,B,,,\n2330,Looking for the devil in the details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition,Heliang Zheng (USTC)*; Jianlong Fu (Microsoft Research); Zheng-Jun Zha (University of Science and Technology of China); Jiebo Luo (U. Rochester),\"Recognition: Detection, Categorization, Retrieval\",Representation Learning,Poster,2.1,46,B,,,\n2333,Multi-Similarity Loss with General Pair Weighting for Deep Metric Learning,Xun Wang (Malong Technologies)*; Xintong Han (Malong Technologies); Weilin Huang (Malong Technologies); Dengke Dong (Malong Technologies); Matthew R Scott (Malong Technologies),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,2.1,47,B,,,\n2454,Domain-Symmetric Networks for Adversarial Domain Adaptation,Yabin Zhang (South China University of Technology); Hui Tang (South China University of Technology); Kui Jia (South China University of Technology)*; Mingkui Tan (South China University of Technology),\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.1,48,B,,,\n2506,End-to-End Supervised Product Quantization for Image Search and Retrieval,\"Benjamin Klein (Tel Aviv University)*; Lior Wolf (Tel Aviv University, Israel)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.1,49,B,,,\n2512,Learning to Learn from Noisy Labeled Data,\"Junnan Li (National University of Singapore)*; Wong Yongkang (National University of Singapore); Qi Zhao (University of Minnesota); Mohan Kankanhalli (National University of Singapore,)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.1,50,B,,,\n2527,DSFD:Dual Shot Face Detector,Jian Li (Nanjing University of Science and Technology)*; Yabiao Wang (Tencent); Changan Wang (Huazhong University of Science and Technology); Ying Tai (Tencent YouTu); Jianjun Qian (Nanjing University of Science and Technology); Jian Yang (Nanjing University of 
Science and Technology); Chengjie Wang (Tencent); Jilin Li (Tencent); Feiyue Huang (Tencent),\"Recognition: Detection, Categorization, Retrieval\",\"Face, Gesture, and Body Pose\",Poster,2.1,51,B,,,\n2572,Label propagation for Deep Semi-supervised Learning,\"Ahmet Iscen (Czech Technical University)*; Giorgos Tolias (Vision Recognition Group, Czech Technical University in Prague); Yannis Avrithis (Inria); Ondrej Chum (Vision Recognition Group, Czech Technical University in Prague)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.1,52,B,,,\n2649,Deep Global Generalized Gaussian Networks,\"Qilong Wang (Tianjin University)*; Peihua Li (Dalian University of Technology); Qinghua Hu (Tianjin University); Pengfei Zhu (tianjin university); Wangmeng Zuo (Harbin Institute of Technology, China)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Statistical Learning,Poster,2.1,53,B,,,\n2659,Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval,Anjan Dutta (Computer Vision Center)*; Zeynep Akata (University of Amsterdam),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Vision + Graphics ; Vision Applications and Systems,Poster,2.1,54,B,,,\n2674,Context-Aware Crowd Counting,\"Weizhe Liu (EPFL)*; Mathieu Salzmann (EPFL); Pascal Fua (EPFL, Switzerland)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Vision Applications and Systems; Visual Reasoning,Poster,2.1,55,B,,,\n2758,Detect-to-Retrieve: Efficient Regional Aggregation for Image Search,Andre Araujo (Google)*; Marvin Teichmann (University of Cambridge); Menglong Zhu (); Jack Sim (Google LLC),\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.1,56,B,,,\n2818,Towards Accurate One-Stage Object Detection with AP-Loss,Kean Chen (Shanghai Jiao Tong University)*; Weiyao Lin (Shanghai Jiao Tong university); Jianguo Li (Intel Labs); John See (Multimedia University); Ji Wang (Tencent); Lingyu Duan 
(Peking University); Zhibo Chen (Tencent); Changwei He (Tencent); Junni Zou (Shanghai Jiao Tong University),\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.1,57,B,,,\n2856,On Exploring Indeterminate Relationships for Visual Relationship Detection,Yibing Zhan (Hangzhou Dianzi University); Jun Yu (HDU)*; Ting Yu (Hangzhou Dianzi University); Dacheng Tao (University of Sydney),\"Recognition: Detection, Categorization, Retrieval\",Scene Analysis and Understanding,Poster,2.1,58,B,,,\n2905,Learning without Memorizing,\"Prithviraj Dhar (University of Maryland, College Park)*; Rajat Vikram Singh (Siemens Corporation); Kuan-Chuan Peng (Siemens Corporation); Ziyan Wu (Siemens Corporation); Rama Chellappa (University of Maryland)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Others; Vision Applications and Systems,Poster,2.1,59,B,,,\n2980,Dynamic Recursive Neural Network,Qiushan Guo (Beijing University of Posts and Telecommunications)*; Zhipeng Yu (Sensetime Group Limited); Yichao Wu (Sensetime Group Limited); Ding Liang (Sensetime Group Limited); Haoyu Qin (Sensetime); Junjie Yan (Sensetime Group Limited),\"Recognition: Detection, Categorization, Retrieval\",Computer Vision Theory,Poster,2.1,60,B,,,\n2992,Destruction and Construction Learning for Fine-grained Image Recognition,Yue Chen (JD AI Research)*; Yalong Bai (JD AI Research); Wei Zhang (JD AI Research); Tao Mei (AI Research of JD.com),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.1,61,B,,,\n3109,Distraction-aware Shadow Detection,Quanlong Zheng (City University of HongKong); Xiaotian Qiao (City University of Hong Kong); Ying Cao (City University of Hong Kong)*; Rynson W.H. 
Lau (City University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",Others,Poster,2.1,62,B,,,\n3140,Multi-Label Image Recognition with Graph Convolutional Networks,Zhao-Min Chen (NanJing University); Xiu-Shen Wei (Nanjing University)*; Peng Wang (The University of Adelaide); Yanwen Guo (-),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.1,63,B,,,\n3171,High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection,Wei Liu (National University of Defense Technology); Shengcai Liao (Inception Institute of Artificial Intelligence)*; Weiqiang Ren (Horizon Robotics); Weidong Hu (National University of Defence Technology); Yinan Yu (Horizon Robotics),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.1,64,B,,,\n3199,RepMet: Representative-based metric learning for classification and few-shot object detection,\"Leonid Karlinsky (IBM-Research)*; Joseph Shtok (IBM-Research); Sivan Harary (IBM-Research); Eli Schwartz (Tel-Aviv University); Amit Aides (IBM); Rogerio Feris (IBM Research AI, MIT-IBM Watson AI Lab); Raja Giryes (Tel Aviv University); Alex Bronstein (Technion)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,2.1,65,B,,,\n3211,Ranked List Loss for Deep Metric Learning,Xinshao Wang (Queen's University Belfast); Yang Hua (Queen's University Belfast)*; Elyor Kodirov (Anyvision); Guosheng Hu (Anyvision); Romain Garnier (Anyvision); Neil Robertson (Queen's University Belfast),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,2.1,66,B,,,\n3315,CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning,Chi Zhang (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Fayao Liu (University of Adelaide); Rui Yao (China University of Mining and Technology); Chunhua Shen (University of 
Adelaide),\"Recognition: Detection, Categorization, Retrieval\",\"Segmentation, Grouping and Shape\",Poster,2.1,67,B,,,\n5953,Precise Detection in Densely Packed Scenes,Eran Goldman (Bar Ilan University); Roei Herzig (Tel Aviv University); Aviv Eisenschtat (Trax); Jacob Goldberger (BIU); Tal Hassner (Open University of Israel)*,\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.1,68,B,,,\n2188,KE-GAN: Knowledge Embedded Generative Adversarial Networks for Semi-Supervised Scene Parsing,\"Mengshi Qi (Beihang University)*; Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China); Jie Qin (Inception Institute of Artificial Intelligence); Annan Li (Beijing University of Aeronautics and Astronautics)\",\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Poster,2.1,69,C,,,\n2338,Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks,Seoung Wug Oh (Yonsei University); Joon-Young Lee (Adobe Research); Ning Xu (Adobe Research); Seon Joo Kim (Yonsei Univ.)*,\"Segmentation, Grouping and Shape\",Deep Learning ; Video Analytics,Poster,2.1,70,C,,,\n2589,Fast Interactive Object Annotation with Curve-GCN,Huan Ling (University of Toronto); Jun Gao (University of Toronto); Amlan Kar (University of Toronto); Wenzheng Chen (University of Toronto); Sanja Fidler (University of Toronto)*,\"Segmentation, Grouping and Shape\",Deep Learning,Poster,2.1,71,C,,,\n2658,FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stochastic Inference,Jungbeom Lee (Seoul National University); Eunji Kim (Seoul National University); Sungmin Lee (Seoul National University); Jangho Lee (Seoul National University); Sungroh Yoon (Seoul National University)*,\"Segmentation, Grouping and Shape\",Deep Learning,Poster,2.1,72,C,,,\n2722,End-to-End Recurrent Net for Video Object Segmentation,Carles Ventura (Universitat Oberta de Catalunya)*; Míriam Bellver (Barcelona 
Supercomputing Center); Andreu Girbau (Universitat Politècnica de Catalunya); Amaia Salvador (Universitat Politècnica de Catalunya); Ferran Marques (Universitat Politecnica de Catalunya); Xavier Giro-i-Nieto (Universitat Politecnica de Catalunya),\"Segmentation, Grouping and Shape\",Deep Learning ; Motion and Tracking; Video Analytics,Poster,2.1,73,C,,,\n2878,DeepFlux for Skeletons in the Wild,Yukang Wang (Huazhong University of Science and Technology); Yongchao Xu (Huazhong University of Science and Technology)*; Xiang Bai (Huazhong University of Science and Technology); Stavros Tsogkas (University of Toronto); Sven Dickinson (University of Toronto); Kaleem Siddiqi (McGill University),\"Segmentation, Grouping and Shape\",,Poster,2.1,74,C,,,\n3244,Interactive Image Segmentation via Backpropagating Refinement Scheme,Won-Dong Jang (Harvard University)*; Chang-Su Kim (Korea university),\"Segmentation, Grouping and Shape\",,Poster,2.1,75,C,,,\n3253,Scene Parsing via Integrated Classification Model and Variance-Based Regularization,Hengcan Shi ( University of Electronic Science and Technology of China)*; Hongliang Li (University of Electronic Science and Technology of China); Qingbo Wu (University of Electronic Science and Technology of China); Zichen Song (University of Electronic Science and Technology of China),\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Poster,2.1,76,C,,,\n2208,RAVEN: A Dataset for Relational and Analogical Visual rEasoNing,\"Chi Zhang (University of California, Los Angeles)*; Feng Gao (UCLA); Baoxiong Jia (UCLA); Yixin Zhu (UCLA); Song-Chun Zhu (UCLA)\",Datasets and Evaluation,Visual Reasoning,Poster,2.1,77,D,,,\n2212,Surface Reconstruction from Normals: A Robust DGP-based Discontinuity Preservation Approach,Wuyuan Xie (Shenzhen University); Miaohui Wang (Shenzhen University)*; Jing Qin (The Hong Kong Polytechnic University); Mingqiang Wei (Nanjing University of Aeronautics and Astronautics); Jianmin Jiang (Shenzhen 
University),Physics-based Vision and Shape-from-X,3D from Single Image,Poster,2.1,78,D,,,\n2273,\"DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Retrieval of Clothing Images\",\"Yuying Ge (The Chinese University of Hong Kong)*; Ruimao Zhang (The Chinese University of Hong Kong); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Xiaoou Tang (The Chinese University of Hong Kong); Ping Luo (The Chinese University of Hong Kong)\",Datasets and Evaluation,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,2.1,79,D,,,\n2278,Jumping Manifolds: Geometry Aware Dense Non-Rigid Structure from Motion,Suryansh Kumar (ANU (Australian National University))*,Physics-based Vision and Shape-from-X,3D from Multiview and Sensors,Poster,2.1,80,D,,,\n2328,LVIS: A Dataset for Large Vocabulary Instance Segmentation,Agrim Gupta (FAIR); Piotr Dollar (FAIR); Ross Girshick (FAIR)*,Datasets and Evaluation,\"Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape\",Poster,2.1,81,D,,,\n2392,Fast Object Class Labelling via Speech,Michael Gygli (Google)*; Vittorio Ferrari (Google Research),Datasets and Evaluation,,Poster,2.1,82,D,,,\n2421,LaSOT: A High-quality Benchmark for Large-scale Single Object Tracking,Heng Fan (Temple University); Liting Lin (South China University of Technology); Fan Yang (Temple University); Peng Chu (Temple University); Ge Deng (Temple University); Sijia Yu (Temple University); Hexin Bai (Temple University); Yong Xu (South China University of Technology); Chunyuan Liao (Hiscene Technology); Haibin Ling (Temple University)*,Datasets and Evaluation,Deep Learning ; Motion and Tracking; Representation Learning; Vision Applications and Systems,Poster,2.1,83,D,,,\n2596,Creative Flow+ Dataset,Maria Shugrina (University of Toronto)*; Ziheng Liang (University of British Columbia); Amlan Kar (University of Toronto); Jiaman Li (University of Toronto); Angad Singh (Evertz 
Microsystems); Karan Singh (University of Toronto); Sanja Fidler (University of Toronto),Datasets and Evaluation,3D from Single Image; Deep Learning ; Motion and Tracking; Vision + Graphics,Poster,2.1,84,D,,,\n2693,Weakly Supervised Open-set Domain Adaptation by Dual-domain Collaboration,\"Shuhan Tan (Sun-Yat-Sen University, China)*; Jiening Jiao (Sun Yat-sen University, China ); WEI-SHI ZHENG (Sun Yat-sen University, China)\",Statistical Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.1,85,D,,,\n2729,A Neurobiological Evaluation Metric for Neural Network Model Search,Nathaniel Blanchard (University of Notre Dame)*; Jeffery D Kinnison (University of Notre Dame); Brandon RichardWebster (University of Notre Dame); Pouya Bashivan (Massachusetts Institute of Technology); Walter Scheirer (University of Notre Dame),Datasets and Evaluation,\"Deep Learning ; Low-level Vision; Recognition: Detection, Categorization, Retrieval; Representation\",Poster,2.1,86,D,,,\n2944,Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision,Alireza Zaeemzadeh (University of Central Florida)*; Mohsen Joneidi (University of Central Florida); Nazanin Rahnavard (University of Central Florida); Mubarak Shah (University of Central Florida),\"Big Data, Large Scale Methods\",Optimization Methods,Poster,2.1,87,D,,,\n2971,Efficient Multi-Domain Learning by Covariance Normalization,Yunsheng Li (UCSD)*; Nuno Vasconcelos (UC San Diego),Statistical Learning,Deep Learning,Poster,2.1,88,D,,,\n3056,Predicting visible image differences under varying display brightness and viewing distance,Nanyang Ye (University of Cambridge)*; Krzysztof Wolski (Max Planck Institut für Informatik); Rafal Mantiuk (University of Cambridge),Datasets and Evaluation,Deep Learning,Poster,2.1,89,D,,,\n3060,A Bayesian Perspective on the Deep Image Prior,\"zezhou cheng (university of massachusetts amherst)*; Matheus A Gadelha (University of 
Massachusetts Amherst); Subhransu Maji (University of Massachusetts, Amherst); Daniel Sheldon (University of Massachusetts, Amherst)\",Statistical Learning,Deep Learning ; Low-level Vision; Optimization Methods,Poster,2.1,90,D,,,\n3193,ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving,\"Xibin Song (Baidu); Peng Wang (Baidu USA LLC.)*; Dingfu Zhou (Baidu); Rui Zhu (UCSD); chenye guan (baidu); Yuchao Dai (Northwestern Polytechnical University); Hao Su (UCSD); HONGDONG LI (Australian National University, Australia); Yang Ruigang (Baidu)\",Datasets and Evaluation,3D from Single Image; Robotics + Driving,Poster,2.1,91,D,,,\n3223,Compressing Unknown Classes with Product Quantizer for Efficient Zero-Shot Classification,Jin Li (Xi'an Jiaotong University); Xuguang Lan (Xi'an Jiaotong University)*; Yang Liu (Xidian University); Le Wang (Xi'an Jiaotong University); Nanning Zheng (Xi'an Jiaotong University),\"Big Data, Large Scale Methods\",\"Recognition: Detection, Categorization, Retrieval\",Poster,2.1,92,D,,,\n3260,Self-Supervised Convolutional Subspace Clustering Network,\"Junjian Zhang (Beijing University of Posts and Telecommunications); Chun-Guang Li (Beijing University of Posts & Telecommunications)*; Chong You (University of California, Berkeley); Xianbiao Qi (Shenzhen Research Institute of Big Data); Honggang Zhang (Beijing University of Posts and Telecommunications); Jun Guo (Beijing University of Posts and Telecommunications); Zhouchen Lin (Peking University)\",Statistical Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,2.1,93,D,,,\n2374,Multi-Scale Geometric Consistency Guided Multi-View Stereo,Qingshan Xu (Huazhong University of Science and Technology); Wenbing Tao (Huazhong University of Science and Technology)*,3D from Multiview and Sensors,Vision Applications and Systems,Poster,2.1,94,E,,,\n2426,Privacy Preserving Image-based Localization,Pablo Speciale (ETH 
Zurich)*; Johannes L Schönberger (Microsoft); Sing Bing Kang (Microsoft Research); Sudipta Sinha (Microsoft Research); Marc Pollefeys (ETH Zurich / Microsoft),3D from Multiview and Sensors,Computer Vision Theory; Vision Applications and Systems,Poster,2.1,95,E,,,\n2455,SimulCap : Single-View Human Performance Capture with Cloth Simulation,\"Tao Yu (Beihang University)*; Zerong Zheng (Tsinghua University); Yuan Zhong (Tsinghua University); Jianhui Zhao (Beihang University); Qionghai Dai (Tsinghua University); Gerard Pons-Moll (MPII, Germany); Yebin Liu (Tsinghua University)\",3D from Multiview and Sensors,Motion and Tracking; Vision + Graphics,Poster,2.1,96,E,,,\n2471,Hierarchical deep stereo matching on high-resolution images,\"Gengshan Yang (Carnegie Mellon University)*; Joshua Manela (Argo AI); Deva Ramanan (Carnegie Mellon University); Michael Happold (Argo AI, LLC)\",3D from Multiview and Sensors,\"Big Data, Large Scale Methods ; Datasets and Evaluation; Deep Learning ; Low-level Vision; Robotics\",Poster,2.1,97,E,,,\n2491,Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference,Yao Yao (The Hong Kong University of Science and Technology); Zixin Luo (HKUST)*; Shiwei Li (HKUST); Tianwei Shen (HKUST); Tian Fang (HKUST); Long Quan (Hong Kong University of Science and Technology),3D from Multiview and Sensors,,Poster,2.1,98,E,,,\n2503,Synthesizing 3D Shapes from Unannotated Image Collections using Multi-projection Generative Adversarial Networks,Xiao Li (University of Science and Technology of China)*; Yue Dong (Microsoft Research Asia); Pieter Peers (College of William & Mary); Xin Tong (Microsoft),3D from Multiview and Sensors,Physics-based Vision and Shape-from-X; Vision + Graphics,Poster,2.1,99,E,,,\n2555,The Perfect Match: 3D Point Cloud Matching with Smoothed Densities,\"Zan Gojcic (ETH Zürich)*; Zhou Caifa (IGP, ETH Zurich); Jan Dirk Wegner (ETH Zurich); Andreas Wieser (ETH Zürich)\",3D from Multiview and Sensors,\"Recognition: Detection, 
Categorization, Retrieval; Representation Learning; Scene Analysis and Under\",Poster,2.1,100,E,,,\n2635,Recurrent Neural Network for (Un-)supervised Learning of Monocular Video Visual Odometry and Depth,Rui Wang (University of North Carolina at Chapel Hill)*; Stephen Pizer (University of North Carolina at Chapel Hill); Jan-Michael Frahm (UNC-Chapel Hill),3D from Multiview and Sensors,3D from Single Image; Deep Learning,Poster,2.1,101,E,,,\n2931,PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing,Hengshuang Zhao (The Chinese University of Hong Kong)*; Li Jiang (The Chinese University of Hong Kong); Chi-Wing Fu (The Chinese University of Hong Kong); Jiaya Jia (Chinese University of Hong Kong),3D from Multiview and Sensors,\"Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,2.1,102,E,,,\n3061,Scan2Mesh: From Unstructured Range Scans to 3D Meshes,Angela Dai (Technical University of Munich)*; Matthias Niessner (Technical University of Munich),3D from Multiview and Sensors,Vision + Graphics,Poster,2.1,103,E,,,\n3179,Unsupervised Domain Adaptation for ToF Data Denoising with Adversarial Learning,Gianluca Agresti (University of Padova)*; Henrik Schaefer (Sony Europe Ltd.); Piergiorgio Sartor (Sony Europe Limited); Pietro Zanuttigh (University of Padova),3D from Multiview and Sensors,Datasets and Evaluation; Deep Learning ; RGBD sensors and analytics,Poster,2.1,104,E,,,\n3257,Learning Independent Object Motion from Unlabelled Stereoscopic Videos,Zhe Cao (UC Berkeley)*; Abhishek Kar (Fyusion Inc.); Christian Haene (UC Berkeley); Jitendra Malik (University of California at Berkley),3D from Multiview and Sensors,\"Motion and Tracking; Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,2.1,105,E,,,\n1170,3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans,Ji Hou (Technical University of Munich); Angela Dai (Technical University of Munich); Matthias Niessner (Technical University of Munich)*,3D from 
Multiview and Sensors,\"Segmentation, Grouping and Shape; Vision Applications and Systems\",Oral,2.1,106,F,,,\n4854,Causes and Corrections for Bimodal Multipath Scanning with Structured Light,yu zhang (Nanjing University )*; Daniel Lau (University of Kentucky); Ying Yu (University of Kentucky),3D from Multiview and Sensors,RGBD sensors and analytics,Oral,2.1,107,F,,,\n7048,TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes,\"Jingwei Huang (Stanford University)*; Haotian Zhang (Stanford University); Li Yi (Stanford); Thomas Funkhouser (Princeton University and Google, Inc.); Matthias Niessner (Technical University of Munich); Leonidas Guibas (Stanford University)\",RGBD sensors and analytics,3D from Multiview and Sensors; Deep Learning ; Vision + Graphics,Oral,2.1,108,F,,,\n704,PlaneRCNN: 3D Plane Detection and Reconstruction from a Single View,Chen Liu (Washington University in St. Louis)*; Kihwan Kim (NVIDIA); Jinwei Gu (NVIDIA); Yasutaka Furukawa (Simon Fraser University); Jan Kautz (NVIDIA),3D from Multiview and Sensors,3D from Single Image; Scene Analysis and Understanding,Oral,2.1,109,F,,,\n3976,Occupancy Networks: Learning 3D Reconstruction in Function Space,\"Lars M Mescheder (MPI-IS and University of Tuebingen)*; Michael Oechsle (MPI-IS, University of Tuebingen and ETAS GmbH); Michael Niemeyer (MPI-IS and University of Tuebingen); Sebastian Nowozin (Google AI Berlin); Andreas Geiger (MPI-IS and University of Tuebingen)\",3D from Single Image,Deep Learning,Oral,2.1,110,F,,,\n2575,3D Shape Reconstruction from Images in the Frequency Domain,\"Weichao Shen (Beijing Institute of Technology)*; Yuwei WU (Beijing Institute of Technology (BIT), China); Yunde Jia (Beijing Institute of Technology)\",3D from Single Image,,Oral,2.1,111,F,,,\n1456,SiCloPe: Silhouette-based Clothed People,Ryota Natsume (Waseda University); Shunsuke Saito (University of Southern California)*; Zeng Huang (University of Southern California); 
Weikai Chen (USC Institute for Creative Technology); Chongyang Ma (Kwai Inc.); Shigeo Morishima (Waseda Research Institute for Science and Engineering); Hao Li (Pinscreen/University of Southern California/USC ICT),3D from Single Image,\"Face, Gesture, and Body Pose\",Oral,2.1,112,F,,,\n3102,Detailed Human Shape Estimation from a Single Image by Hierarchical Mesh Deformation,\"Hao Zhu (Nanjing University)*; Xinxin Zuo (University of Kentucky); Sen Wang (Northwestern Polytechnical University); Xun Cao (Nanjing University); Ruigang Yang (University of Kentucky, USA)\",3D from Single Image,Deep Learning,Oral,2.1,113,F,,,\n4841,Convolutional Mesh Regression for Single-Image Human Shape Reconstruction,Nikos Kolotouros (University of Pennsylvania)*; Georgios Pavlakos (University of Pennsylvania); Kostas Daniilidis (University of Pennsylvania),3D from Single Image,\"Face, Gesture, and Body Pose\",Oral,2.1,114,F,,,\n2754,H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions,Bugra Tekin (Microsoft)*; Federica Bogo (Microsoft); Marc Pollefeys (ETH Zurich / Microsoft),3D from Single Image,\"Face, Gesture, and Body Pose ; Recognition: Detection, Categorization, Retrieval\",Oral,2.1,115,F,,,\n3419,Learning the Depths of Moving People by Watching Frozen People,Zhengqi Li (Cornell University)*; Tali Dekel (Google); Forrester Cole (Google Research); Richard Tucker (Google); Ce Liu (Google); Bill Freeman (Google); Noah Snavely (Cornell University and Google AI),3D from Single Image,3D from Multiview and Sensors; Deep Learning,Oral,2.1,116,F,,,\n3439,Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion,\"Zhenpei Yang (The University of Texas at Austin); Kristen Grauman (Facebook AI Research & UT Austin); Qixing Huang (The University of Texas at Austin)*; Linjie Luo (Snap Inc); Xiaowei Zhou (Zhejiang Univ., China); Jeffrey Pan (Austin, Texas)\",3D from Single Image,3D from Multiview and Sensors; Deep Learning ; Optimization Methods; RGBD sensors 
and analytics,Oral,2.1,117,F,,,\n1943,A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Images,\"Jiapeng Tang (South China University of Technology); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen))*; Junyi Pan (South China University of Technology); Kui Jia (South China University of Technology); Xin Tong (Microsoft)\",3D from Single Image,Deep Learning,Oral,2.1,118,F,,,\n3451,Structure-And-Motion-Aware Rolling Shutter Correction,Bingbing Zhuang (NUS)*; Quoc-Huy Tran (NEC Labs America); Pan Ji (NEC Labs); Loong Fah Cheong (NUS); Manmohan Chandraker (NEC Labs America),3D from Single Image,3D from Multiview and Sensors,Oral,2.1,119,F,,,\n3871,PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation,\"sida peng (Zhejiang University); Yuan Liu (Zhejiang University); Qixing Huang (The University of Texas at Austin); Hujun Bao (Zhejiang University); Xiaowei Zhou (Zhejiang Univ., China)*\",3D from Single Image,Robotics + Driving; Scene Analysis and Understanding,Oral,2.1,120,F,,,\n2423,Learning Single-Image Depth from Videos using Quality Assessment Networks,\"Weifeng Chen (University of Michigan, Ann Arbor)*; Shengyi Qian (University of Michigan, Ann Arbor); Jia Deng (Princeton University)\",3D from Single Image,Datasets and Evaluation,Poster,2.1,121,F,,,\n2460,Learning 3D Human Dynamics from Video,Angjoo Kanazawa (University of California Berkeley)*; Jason Zhang (University of California Berkeley); Panna Felsen (University of California Berkeley); Jitendra Malik (University of California at Berkley),3D from Single Image,\"3D from Multiview and Sensors; Face, Gesture, and Body Pose ; Video Analytics\",Poster,2.1,122,F,,,\n2993,Lending Orientation to Neural Networks for Cross-view Geo-localization,\"Liu liu (ANU (Australian National University))*; HONGDONG LI (Australian National University, Australia)\",3D from Single Image,3D from Multiview and Sensors; Scene 
Analysis and Understanding,Poster,2.1,123,F,,,\n3020,Unsupervised Learning of Depth from Defocus Using a Gaussian PSF Layer,\"Shir Gur (Tel Aviv University); Lior Wolf (Tel Aviv University, Israel)*\",3D from Single Image,Deep Learning,Poster,2.1,124,F,,,\n3091,Bilateral Cyclic Constraint and Adaptive Regularization for Monocular Depth Prediction,\"Alex Wong (University of California, Los Angeles)*; Stefano Soatto (UCLA)\",3D from Single Image,,Poster,2.1,125,F,,,\n2207,Face Parsing with RoI Tanh-warping,Jinpeng Lin (Xiamen University); Hao Yang (Microsoft Research Asia)*; Dong Chen (Microsoft Research Asia); Ming Zeng (Software School of Xiamen University); Fang Wen (Microsoft Research Asia); Lu Yuan (Microsoft),\"Face, Gesture, and Body Pose\",,Poster,2.1,126,G,,,\n2248,Multi-person Articulated Tracking with Spatial and Temporal Embeddings,Sheng Jin (Tsinghua University); Wentao Liu (Sensetime)*; Wanli Ouyang (The University of Sydney); Chen Qian (SenseTime),\"Face, Gesture, and Body Pose\",Motion and Tracking,Poster,2.1,127,G,,,\n2345,Multi-Person Pose Estimation with Enhanced Channel-wise and Spatial Information,Kai Su (Southeast University)*; Dongdong Yu (ByteDance); Zhenqi Xu (Bytedance); Xin Geng (Southeast University); Changhu Wang (ByteDance.Inc),\"Face, Gesture, and Body Pose\",Deep Learning,Poster,2.1,128,G,,,\n2958,A Compact Embedding for Facial Expression Similarity,Raviteja Vemulapalli (Google)*; Aseem Agarwala (Google),\"Face, Gesture, and Body Pose\",\"Datasets and Evaluation; Deep Learning ; Recognition: Detection, Categorization, Retrieval; Represen\",Poster,2.1,129,G,,,\n3142,High-Resolution Representation Learning for Human Pose Estimation,Ke Sun (University of Science and Technology of China); Bin Xiao (MSR Asia); Dong Liu (University of Science and Technology of China); Jingdong Wang (Microsoft Research)*,\"Face, Gesture, and Body Pose\",,Poster,2.1,130,G,,,\n3198,Feature Transfer Learning for Face Recognition with Under-Represented Data,Xi 
Yin (Microsoft Cloud & AI)*; Xiang Yu (NEC Labs); Kihyuk Sohn (NEC Labs America); Xiaoming Liu (Michigan State University); Manmohan Chandraker (NEC Labs America),\"Face, Gesture, and Body Pose\",Biometrics,Poster,2.1,131,G,,,\n3250,Unsupervised 3D Pose Estimation with Geometric Self-Supervision,Ching-Hang Chen (Amazon Inc.)*; Ambrish Tyagi (Amazon); Dylan Drover (Amazon Lab126); Rohith MV (Amazon Lab126); James Rehg (Georgia Institute of Technology); Stefan Stojanov (Georgia Institute of Technology); Amit Agrawal (Amazon),\"Face, Gesture, and Body Pose\",3D from Single Image; Deep Learning,Poster,2.1,132,G,,,\n2601,Peeking into the future: Predicting Future Person Activities and Locations in Videos,Junwei Liang (Carnegie Mellon University)*; Lu Jiang (Google); Juan Carlos Niebles (Stanford University); Alexander Hauptmann (Carnegie Mellon University); Li Fei-Fei (Stanford University),Video Analytics,Action Recognition ; Scene Analysis and Understanding,Poster,2.1,133,H,,,\n2953,Re-Identification with Consistent Attentive Siamese Networks,\"Meng Zheng (Rensselaer Polytechnic Institute); Srikrishna Karanam (Siemens Corporate Technology, Princeton)*; Ziyan Wu (Siemens Corporation); Richard Radke (Rensselaer Polytechnic Institute)\",Video Analytics,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.1,134,H,,,\n236,Learning Optical Flow with Occlusion Hallucination,Pengpeng Liu (The Chinese University of Hong Kong)*; Michael Lyu (The Chinese University of Hong Kong); Irwin King (The Chinese University of Hong Kong); Jia Xu (Tencent AI Lab),Motion and Tracking,Deep Learning,Oral,2.1,135,I,,,\n3963,Taking a Deeper Look at the Inverse Compositional Algorithm,Zhaoyang Lv (GEORGIA TECH)*; Frank Dellaert (Georgia Tech); James Rehg (Georgia Institute of Technology); Andreas Geiger (MPI-IS and University of Tuebingen),Motion and Tracking,3D from Multiview and Sensors; Optimization Methods,Oral,2.1,136,I,,,\n1197,Deeper and Wider Siamese Networks for Real-Time 
Visual Tracking,Zhipeng Zhang (Chinese Academy of Sciences); Houwen Peng (Microsoft Research)*,Motion and Tracking,,Oral,2.1,137,I,,,\n952,High Fidelity Facial Performance Tracking In-the-wild,Jae Shin Yoon (University of Minnesota)*; Takaaki Shiratori (Facebook Reality Labs); Shoou-I Yu (Oculus Research Pittsburgh); Hyun Soo Park (The University of Minnesota),\"Face, Gesture, and Body Pose\",3D from Single Image; Motion and Tracking,Oral,2.1,138,I,,,\n2738,Diverse Generation for Multi-agent Sports Games,Raymond A Yeh (UIUC)*; Alexander Schwing (UIUC); Jonathan Huang (Google); Kevin Murphy (Google),Motion and Tracking,Deep Learning,Oral,2.1,139,I,,,\n3444,Efficient Online Multi-Person 2D Pose Tracking with Recurrent Spatio-Temporal Affinity Fields,Yaadhav Raaj (CMU)*; Haroon Idrees (Carnegie Mellon University); Gines Hidalgo Martinez (Carnegie Mellon University); Yaser Sheikh (CMU),\"Face, Gesture, and Body Pose\",Action Recognition ; Motion and Tracking; Video Analytics,Oral,2.1,140,I,,,\n1391,GFrames: Gradient-Based Local Reference Frame for 3D Shape Matching,\"Simone Melzi (University of Verona)*; Riccardo Spezialetti (Universita' degli studi di Bologna); Federico Tombari (Technical University of Munich, Germany); Michael Bronstein (Università della Svizzera Italiana); Luigi Di Stefano (University of Bologna); Emanuele Rodola (Sapienza University of Rome)\",Motion and Tracking,\"3D from Multiview and Sensors; Low-level Vision; Recognition: Detection, Categorization, Retrieval;\",Oral,2.1,141,I,,,\n6191,Eliminating Exposure Bias and Loss-Evaluation Mismatch in Multiple Object Tracking,\"Pascal Fua (EPFL, Switzerland); Andrii Maksai (EPFL)*\",Motion and Tracking,Deep Learning,Oral,2.1,142,I,,,\n3119,Graph Convolutional Tracking,\"Junyu Gao (National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences)*; Tianzhu Zhang (CAS, China); Changsheng Xu (CASIA)\",Motion and Tracking,,Oral,2.1,143,I,,,\n4984,ATOM: Accurate Tracking by 
Overlap Maximization,Martin Danelljan (ETH Zurich)*; Goutam Bhat (ETH Zurich); Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence); Michael Felsberg (Linköping University),Motion and Tracking,,Oral,2.1,144,I,,,\n1202,Visual Tracking via Adaptive Spatially-Regularized Correlation Filters,Kenan Dai (Dalian University of Technology); Dong Wang (Dalian University of Technology)*; Huchuan Lu (Dalian University of Technology); Chong Sun (Dalian University of Technology); Jianhua Li (Dalian University of Technology),Motion and Tracking,,Oral,2.1,145,I,,,\n496,Deep Tree Learning for Zero-shot Face Anti-Spoofing,Yaojie Liu (Michigan State University)*; Joel Stehouwer (Michigan State University); Amin Jourabloo (Michigan State University); Xiaoming Liu (Michigan State University),Biometrics,\"Face, Gesture, and Body Pose\",Oral,2.1,146,I,,,\n1140,ArcFace: Additive Angular Margin Loss for Deep Face Recognition,Jiankang Deng (Imperial College London)*; Jia Guo (DeepInsight); Niannan Xue (Imperial College London); Stefanos Zafeiriou (Imperial College London),Biometrics,\"Face, Gesture, and Body Pose ; Recognition: Detection, Categorization, Retrieval\",Oral,2.1,147,I,,,\n1617,Learning Joint Unique-Gait and Cross-Gait Representation by Minimizing Quintuplet Loss,\"Kaihao Zhang (Australian National University)*; Wenhan Luo (Tencent AI Lab); Lin Ma (Tencent AI Lab); Wei Liu (Tencent); HONGDONG LI (Australian National University, Australia)\",Biometrics,,Oral,2.1,148,I,,,\n4898,Gait Recognition via Disentangled Representation Learning,Ziyuan Zhang (Michigan State University)*; Luan Tran (Michigan State University); Xi Yin (Microsoft Cloud & AI); Yousef A Atoum (Yarmouk University); Xiaoming Liu (Michigan State University); Nanxin Wang (Ford Motor Company); Jian Wan (Ford Motor Company),Biometrics,\"Face, Gesture, and Body Pose ; Representation Learning; Vision Applications and Systems\",Oral,2.1,149,I,,,\n2448,On the Continuity of Rotation Representation in Neural 
Networks,Yi Zhou (University of Southern California)*; Connelly Barnes (University of Virginia); Jingwan Lu (Adobe Research); Jimei Yang (Adobe); Hao Li (Pinscreen/University of Southern California/USC ICT),Motion and Tracking,\"3D from Single Image; Deep Learning ; Face, Gesture, and Body Pose ; Robotics + Driving\",Poster,2.1,150,I,,,\n2597,Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation,Junhwa Hur (TU Darmstadt)*; Stefan Roth (TU Darmstadt),Motion and Tracking,Deep Learning,Poster,2.1,151,I,,,\n2619,Inverse Discriminative Networks for Handwritten Signature Verification,Huan Li (Xi’an Jiaotong University); Ping Wei (Xi'an Jiaotong University)*; Ping Hu (Xi'an Jiaotong University),Biometrics,\"Datasets and Evaluation; Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,2.1,152,I,,,\n2656,Led3D: A Lightweight and Efficient Deep Approach to Recognizing Low-quality 3D Faces,\"Guodong Mu (Beihang University); Di Huang (Beihang University, China)*; Guosheng Hu (AnyVision); Jia Sun (Beihang University); Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China)\",Biometrics,\"Face, Gesture, and Body Pose\",Poster,2.1,153,I,,,\n2985,ROI Pooled Correlation Filters for Visual Tracking,Yuxuan Sun (Dalian University of Technology)*; Chong Sun (Tencent Youtu Lab); Dong Wang (Dalian University of Technology); Huchuan Lu (Dalian University of Technology); You He (Naval Aviation University),Motion and Tracking,,Poster,2.1,154,I,,,\n1345,Deep Video Inpainting,Dahun Kim (KAIST)*; Sanghyun Woo (KAIST); Joon-Young Lee (Adobe Research); In So Kweon (KAIST),Image and Video Synthesis,Computational Photography; Deep Learning,Poster,2.1,155,J,,,\n2446,DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-image Synthesis,\"Minfeng Zhu (State Key Lab of CAD&CG, Zhejiang University)*; Pingbo Pan (University of Technology Sydney); Wei Chen (Zhejiang University); Yi 
Yang (UTS)\",Image and Video Synthesis,Deep Learning ; Vision + Language,Poster,2.1,156,J,,,\n2525,Non-Adversarial Image Synthesis with Generative Latent Nearest Neighbors,Yedid Hoshen (Facebook AI Research (FAIR))*; Ke Li (UC Berkeley); Jitendra Malik (University of California at Berkeley),Image and Video Synthesis,,Poster,2.1,157,J,,,\n2651,Mixture Density Generative Adversarial Networks,\"Hamid Eghbal-zadeh (LIT AI Lab & Johannes Kepler University, Institute of Computational Perception)*; Werner Zellinger (Johannes Kepler University, Institute of Knowledge-Based Mathematical Systems); Gerhard Widmer (Johannes Kepler University)\",Image and Video Synthesis,Deep Learning,Poster,2.1,158,J,,,\n2669,SketchGAN: Joint Sketch Completion and Recognition with Generative Adversarial Network,\"Fang Liu (Institute of Software Chinese Academy of Sciences, University of Chinese Academy of Sciences); Xiaoming Deng (Institute of Software, Chinese Academy of Sciences); Yukun Lai (Cardiff University); Yong-Jin Liu (Tsinghua University); Cuixia Ma (Institute of Software Chinese Academy of Sciences)*; Hongan Wang (Institute of Software, Chinese Academy of Sciences)\",Image and Video Synthesis,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.1,159,J,,,\n2672,Foreground-aware Image Inpainting,Wei Xiong (University of Rochester)*; Jiahui Yu (UIUC); Zhe Lin (Adobe Research); Jimei Yang (Adobe); Xin Lu (Adobe); Connelly Barnes (University of Virginia); Jiebo Luo (U. 
Rochester),Image and Video Synthesis,Vision + Graphics,Poster,2.1,160,J,,,\n2711,Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-to-Image Translation,Matteo Tomei (University of Modena and Reggio Emilia); Marcella Cornia (University of Modena and Reggio Emilia); Lorenzo Baraldi (University of Modena and Reggio Emilia)*; Rita Cucchiara (Universita Di Modena E Reggio Emilia),Image and Video Synthesis,Deep Learning ; Vision Applications and Systems,Poster,2.1,161,J,,,\n2786,Structure-Preserving Stereoscopic View Synthesis with Multi-Scale Adversarial Correlation Matching,Yu Zhang (SenseTime Research)*; Dongqing Zou (SenseTime Research); Jimmy Ren (SenseTime Research); Zhe Jiang (SenseTime Research); Xiaohao Chen (SenseTime Research),Image and Video Synthesis,3D from Single Image; Deep Learning,Poster,2.1,162,J,,,\n2998,DynTypo: Example-based Dynamic Text Effects Transfer,Yifang Men (Peking University)*; Zhouhui Lian (Peking University); Jianguo Xiao (PKU); Yingmin Tang (PKU),Image and Video Synthesis,Low-level Vision; Vision Applications and Systems,Poster,2.1,163,J,,,\n3151,Arbitrary Style Transfer with Style-Attentional Networks,Dae Young Park (AIRI); Kwang Hee Lee (Boeing Korea Engineering and Technology Center (BKETC))*,Image and Video Synthesis,Deep Learning ; Representation Learning; Vision Applications and Systems,Poster,2.1,164,J,,,\n3159,Typography with Decor: Intelligent Text Style Transfer,Wenjing Wang (Peking University)*; Jiaying Liu (Peking University); Shuai Yang ( Peking University); Zongming Guo (Peking University),Image and Video Synthesis,Computational Photography; Low-level Vision,Poster,2.1,165,J,,,\n2331,RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion,Muhammad Sarmad (KAIST)*; Hyunjoo Jenny Lee (KAIST-BMMLab); Young Min Kim (KIST),Vision + Graphics,\"3D from Multiview and Sensors; Deep Learning ; Segmentation, Grouping and Shape\",Poster,2.1,166,K,,,\n2432,Photo 
Wake-Up: 3D Character Animation from a Single Photo,Chung-Yi Weng (University of Washington)*; Brian Curless (University of Washington); Ira Kemelmacher-Shlizerman (University of Washington + Facebook),Vision + Graphics,3D from Single Image; Computational Photography; Image and Video Synthesis,Poster,2.1,167,K,,,\n2444,Learning to Light for Mobile Mixed Reality in Unconstrained Environments,Chloe LeGendre (Google Inc.)*; Wan-Chun Alex Ma (Google Inc.); Graham Fyffe (Google Inc.); John P Flynn (Google Inc); Laurent Charbonnel (Google Inc.); Jay Busch (Google Inc.); Paul E Debevec (Google VR),Vision + Graphics,Deep Learning ; Image and Video Synthesis; Scene Analysis and Understanding,Poster,2.1,168,K,,,\n2823,Iterative Residual CNNs for Burst Photography Applications,Filippos Kokkinos (Skolkovo Institute of Science and Technology)*; Stamatis Lefkimmiatis (Skolkovo Institute of Science and Technology),Computational Photography,Deep Learning ; Low-level Vision; Optimization Methods,Poster,2.1,169,K,,,\n2932,Learning Implicit Fields for Generative Shape Modeling,Zhiqin Chen (Simon Fraser University)*; Hao Zhang (Simon Fraser University),Vision + Graphics,3D from Single Image; Deep Learning,Poster,2.1,170,K,,,\n3227,Reliable and Efficient Image Cropping: A Grid Anchor based Approach,\"hui zeng (The Hong Kong Polytechnic University)*; lida li (The Hong Kong Polytechnic University); zisheng cao (Da-Jiang Innovations); Lei Zhang (\"\"Hong Kong Polytechnic University, Hong Kong, China\"\")\",Computational Photography,Datasets and Evaluation,Poster,2.1,171,K,,,\n3312,Patch-based Progressive 3D Point Set Upsampling,Yifan Wang (ETH Zurich)*; Shihao Wu (ETH Zurich); Hui Huang (Shenzhen University); Daniel Cohen-Or (Tel Aviv University); Olga Sorkine-Hornung (ETH Zurich),Vision + Graphics,3D from Multiview and Sensors; Deep Learning,Poster,2.1,172,K,,,\n1405,An Iterative and Cooperative Top-down and Bottom-up Inference Network for Salient Object Detection,Wenguan Wang (Inception 
Institute of Artificial Intelligence); Jianbing Shen (Beijing Institute of Technology)*; Ming-Ming Cheng (Nankai University); Ling Shao (Inception Institute of Artificial Intelligence),Low-level Vision,Image and Video Synthesis,Poster,2.1,173,L,,,\n1936,Deep Stacked Hierarchical Multi-Patch Network for Image Deblurring,\"Hongguang Zhang (Australian National University)*; Yuchao Dai (Northwestern Polytechnical University); HONGDONG LI (Australian National University, Australia); Piotr Koniusz (Data61/CSIRO, ANU)\",Low-level Vision,Computational Photography; Deep Learning ; Motion and Tracking,Poster,2.1,174,L,,,\n2258,Turn a Silicon Camera into an InGaAs Camera,Feifan Lv (Beihang University); Yinqiang Zheng (National Institute of Informatics); BoHan Zhang (Beihang University); Feng Lu (Beihang University)*,Low-level Vision,Image and Video Synthesis,Poster,2.1,175,L,,,\n2318,Low-rank Tensor Completion with a New Tensor Nuclear Norm Induced by Invertible Linear Transforms,\"Canyi Lu (Carnegie Mellon University)*; Xi Peng (College of Computer Science, Sichuan University); Yunchao Wei (UIUC)\",Optimization Methods,Representation Learning; Statistical Learning,Poster,2.1,176,L,,,\n2348,Joint Representative Selection and Feature Learning: A Semi-Supervised Approach,\"Suchen Wang (Nanyang Technological University)*; Jingjing Meng (State University of New York at Buffalo); Junsong Yuan (\"\"State University of New York at Buffalo, USA\"\"); Yap-Peng Tan (Nanyang Technological University, Singapore)\",Optimization Methods,Representation Learning; Video Analytics; Vision Applications and Systems,Poster,2.1,177,L,,,\n2468,The Domain Transform Solver,Akash Bapat (University of North Carolina at Chapel Hill)*; Jan-Michael Frahm (UNC-Chapel Hill),Optimization Methods,3D from Multiview and Sensors,Poster,2.1,178,L,,,\n2483,CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection,Lu Zhang (Dalian University of Technology); Huchuan Lu (Dalian University of 
Technology)*; Zhe Lin (Adobe Research); Jianming Zhang (Adobe Research); You He (Naval Aviation University),Low-level Vision,,Poster,2.1,179,L,,,\n2485,Phase-only Image Based Kernel Estimation for Single-image Blind Deblurring,\"Liyuan Pan (The Australian National University)*; Miaomiao Liu (The Australian National University); Yuchao Dai (Northwestern Polytechnical University); RICHARD HARTLEY (Australian National University, Australia)\",Low-level Vision,Computer Vision Theory,Poster,2.1,180,L,,,\n2752,Hierarchical Discrete Distribution Decomposition for Match Density Estimation,Zhichao Yin (UC Berkeley); Trevor Darrell (UC Berkeley); Fisher Yu (UC Berkeley)*,Low-level Vision,Deep Learning ; Motion and Tracking,Poster,2.1,181,L,,,\n2791,FOCNet: A Fractional Optimal Control Network for Image Denoising,\"Xixi Jia (Xidian University)*; Sanyang Liu ( Xidian University); Xiangchu Feng ( Xidian University); Lei Zhang (\"\"Hong Kong Polytechnic University, Hong Kong, China\"\")\",Low-level Vision,Deep Learning,Poster,2.1,182,L,,,\n2840,Orthogonal Decomposition Network for Pixel-wise Binary Classification,\"Chang Liu (University of Chinese Academy of Sciences)*; Fang Wan (University of Chinese Academy of Sciences); Wei Ke (University of Chinese Academy of Sciences); Zhuowei Xiao (Institute of Geology and Geophysics, Chinese Academy of Sciences); Yuan Yao (University of Chinese Academy of Sciences); Xiaosong Zhang (University of Chinese Academy of Sciences); Qixiang Ye (University of Chinese Academy of Sciences, China)\",Low-level Vision,\"Recognition: Detection, Categorization, Retrieval; Representation Learning; Segmentation, Grouping a\",Poster,2.1,183,L,,,\n2984,Multi-source weak supervision for saliency detection,Yu Zeng (Dalian University of Technology)*; Huchuan Lu (Dalian University of Technology); Lihe Zhang (Dalian University of Technology); Yunzhi Zhuge (Dalian University of Technology); Mingyang Qian (Dalian University of Technology); Yizhou Yu (Deepwise AI 
Lab),Low-level Vision,Scene Analysis and Understanding,Poster,2.1,184,L,,,\n2999,ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples,Xiaojun Jia (Institute of Information Engineering，Chinese Academy of Sciences); Xingxing Wei (Tsinghua University); Xiaochun Cao (Chinese Academy of Sciences)*; Hassan Foroosh (University of Central Florida),Low-level Vision,Deep Learning,Poster,2.1,185,L,,,\n3062,Combinatorial persistency criteria for multicut and max-cut,\"Jan-Hendrik Lange (MPI for Informatics)*; Bjoern Andres (University of Tübingen, Bosch Center for AI); Paul Swoboda (MPI for Informatics)\",Optimization Methods,\"Segmentation, Grouping and Shape\",Poster,2.1,186,L,,,\n3132,S4Net: Single Stage Salient-Instance Segmentation,Ruochen Fan (Tsinghua University); Ming-Ming Cheng (Nankai University)*; Qibin Hou (Nankai University); Tai-Jiang Mu (Tsinghua University); Jingdong Wang (Microsoft Research); Shimin Hu (Tsinghua University),Low-level Vision,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,2.1,187,L,,,\n3258,A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem,\"Ganzhao Yuan (Sun Yat-Sen University)*; Li Shen (Tencent AI Lab); WEI-SHI ZHENG (Sun Yat-sen University, China)\",Optimization Methods,,Poster,2.1,188,L,,,\n2334,Polynomial Representation for Persistence Diagram,\"Zhichao Wang (Tsinghua University)*; Qian Li (University of Technology Sydney); Gang Li (Deakin University, Australia); Guandong Xu (University of Technology Sydney, Australia)\",Representation Learning,\"Computer Vision Theory; Medical, Biological and Cell Microscopy; Recognition: Detection, Categorizat\",Poster,2.1,189,M,,,\n2336,Crowd Counting and Density Estimation by Trellis Encoder-Decoder Network,\"Xiaolong Jiang (Beihang University)*; Zehao Xiao (Beihang University); Baochang Zhang (Beihang University); Xiantong Zhen (Inception Institute of Artificial Intelligence); Xianbin Cao 
(Beihang University, China); David Doermann (University at Buffalo); Ling Shao (Inception Institute of Artificial Intelligence)\",Scene Analysis and Understanding,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,2.1,190,M,,,\n2492,Cross-atlas Convolution for Parameterization Invariant Learning on Textured Mesh Surface,Shiwei Li (HKUST)*; Zixin Luo (HKUST); Mingmin Zhen (Hong Kong University of Science and Technology); Yao Yao (The Hong Kong University of Science and Technology); Tianwei Shen (HKUST); Tian Fang (Altizure); Long Quan (Hong Kong University of Science and Technology),Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape\",Poster,2.1,191,M,,,\n2699,Deep Surface Normal Estimation with Hierarchical RGB-D Fusion,Jin Zeng (SenseTime Research)*; Yanfeng Tong (Beijing Institute of Technology); Yunmu Huang (SenseTime Research); Qiong Yan (SenseTime Group Limited); Wenxiu Sun (SenseTime Research); Jing Chen (Beijing Institute of Technology); Yongtian Wang (Beijing Institute of Technology),Scene Analysis and Understanding,Deep Learning,Poster,2.1,192,M,,,\n2807,Knowledge-Embedded Routing Network for Scene Graph Generation,Tianshui Chen (Sun Yat-Sen University)*; Weihao Yu (Sun Yat-sen University); Riquan Chen (Sun Yat-Sen University); Liang Lin (Sun Yat-sen University),Scene Analysis and Understanding,Visual Reasoning,Poster,2.1,193,M,,,\n2873,An End-to-end Network for Panoptic Segmentation,\"huanyu liu (Zhejiang University)*; Chao Peng (Megvii(Face++) Inc); Changqian Yu (Huazhong University of Science and Technology); Jingbo Wang (Peking University); Xu Liu (The University of Tokyo); Gang Yu (Face++); Wei Jiang (Department of Control Science and Engineering, Zhejiang University)\",Scene Analysis and Understanding,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Robotics + Driving; 
Segmentation,\",Poster,2.1,194,M,,,\n2950,Fast and Flexible Indoor Scene Synthesis with Deep Convolutional Generative Models,Daniel Ritchie (Brown University)*; Kai Wang (Brown University); Yu-An Lin (Brown University),Scene Analysis and Understanding,Deep Learning ; Vision + Graphics,Poster,2.1,195,M,,,\n3081,Marginalized Latent Semantic Encoder for Zero-Shot Learning,Zhengming Ding (IUPUI)*; Hongfu Liu (Brandeis University),Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.1,196,M,,,\n3177,Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation,Jaime Spencer Martin (University of Surrey)*; Richard Bowden (University of Surrey); Simon Hadfield (University of Surrey),Representation Learning,\"Deep Learning ; Robotics + Driving; Segmentation, Grouping and Shape\",Poster,2.1,197,M,,,\n3213,Unsupervised Embedding Learning Using Invariant and Spreading Instance Feature,\"Mang YE (Hong Kong Baptist University)*; Xu Zhang (Columbia University); PongChi Yuen (Department of Computer Science, Hong Kong Baptist University); Shih-Fu Chang (Columbia University)\",Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.1,198,M,,,\n3256,Learning Deep Compositional Grammatical Architectures for Visual Recognition,Xilai Li (NC State University); Tianfu Wu (NC State University)*; Xi Song (None),Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,2.1,199,M,,,\n3270,A Robust Local Spectral Descriptor for Matching Non-Rigid Shapes with Incompatible Shape Structures,\"Yiqun Wang (NLPR, Institute of Automation, Chinese Academy of Sciences)*; Jianwei Guo (NLPR, Institute of Automation, Chinese Academy of Sciences); Yan Dong-Ming (NLPR, CASIA); Kai Wang (CNRS, GIPSA-lab); Xiaopeng Zhang (Institute of Automation, Chinese Academy of Sciences)\",Scene Analysis and Understanding,\"Face, Gesture, and Body Pose ; Low-level Vision; Vision + 
Graphics\",Poster,2.1,200,M,,,\n1800,Context and Attribute Grounded Dense Captioning,\"Guojun Yin (University of Science and Technology of China); Lu Sheng (The Chinese University of Hong Kong)*; Bin Liu (University of Science and Technology of China); Nenghai Yu (University of Science and Technology of China); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Jing Shao (Sensetime)\",Vision + Language,Scene Analysis and Understanding,Poster,2.1,201,N,,,\n2206,Spot and Learn: A Maximum-Entropy Image Patch Sampler for Few-Shot Classification,Wen-Hsuan Chu (Carnegie Mellon University)*; Jing-Cheng Chang (National Taiwan University); Yu-Jhe Li (National Taiwan University); Yu-Chiang Frank Wang (National Taiwan University),Visual Reasoning,Deep Learning,Poster,2.1,202,N,,,\n2349,Interpreting CNNs via Explanatory Trees,\"Quanshi Zhang (Shanghai Jiao Tong University)*; Yu Yang (UCLA); Haotian Ma (Southern University of Science and Technology); Ying Nian Wu (University of California, Los Angeles)\",Visual Reasoning,\"Deep Learning ; Others; Representation Learning; Segmentation, Grouping and Shape; Statistical Learn\",Poster,2.1,203,N,,,\n2380,Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning,Dong-Jin Kim (KAIST)*; Jinsoo Choi (KAIST); Tae-Hyun Oh (MIT CSAIL); In So Kweon (KAIST),Vision + Language,Deep Learning,Poster,2.1,204,N,,,\n2401,Deep Modular Co-Attention Networks for Visual Question Answering,Zhou Yu (Hangzhou Dianzi University)*; Jun Yu (HDU); Yuhao Cui (Hangzhou Dianzi University); Dacheng Tao (University of Sydney); Qi Tian (Huawei Noah’s Ark Lab),Vision + Language,Visual Reasoning,Poster,2.1,205,N,,,\n2591,Synthesizing Environment-Aware Activities via Activity Sketches,Yuan-Hong Liao (National Tsing Hua University)*; Sanja Fidler (University of Toronto); Antonio Torralba (MIT); Xavier Puig (MIT); Marko Boben (University of Ljubljana),Vision + Language,Datasets and Evaluation,Poster,2.1,206,N,,,\n2710,\"Show, Control 
and Tell: A Framework for Generating Grounded and Controllable Captions\",Marcella Cornia (University of Modena and Reggio Emilia); Lorenzo Baraldi (University of Modena and Reggio Emilia)*; Rita Cucchiara (Universita Di Modena E Reggio Emilia),Vision + Language,Deep Learning ; Visual Reasoning,Poster,2.1,207,N,,,\n2730,Multi-target Embodied Question Answering,Licheng Yu (University of North Carolina at Chapel Hill)*; Xinlei Chen (Facebook AI Research); Georgia Gkioxari (Facebook); Mohit Bansal (University of North Carolina at Chapel Hill); Tamara Berg (University on North carolina); Dhruv Batra (Georgia Tech & Facebook AI Research),Vision + Language,Visual Reasoning,Poster,2.1,208,N,,,\n2795,Visual question answering as reading comprehension,Hui Li (the University of Adelaide)*; Peng Wang (Northwestern Polytechnical University); Chunhua Shen (University of Adelaide); Anton van den Hengel (University of Adelaide),Vision + Language,Visual Reasoning,Poster,2.1,209,N,,,\n3040,StoryGAN: A Sequential Conditional GAN for Story Visualization,Yitong Li (Duke University)*; Jianfeng Gao (Microsoft Research); David Carlson (Duke); Yu Cheng (Microsoft); Zhe Gan (Microsoft); Jingjing Liu (Microsoft); Lawrence Carin Duke (CS); Yelong Shen (Microsoft); Yuexin Wu (Carnegie Mellon University),Vision + Language,,Poster,2.1,210,N,,,\n2298,Noise-Aware Unsupervised Deep Lidar-Stereo Fusion,\"Xuelian Cheng (Australian National University); Yiran Zhong (Australian National University); Yuchao Dai (Northwestern Polytechnical University)*; Pan Ji (NEC Laboratories America); HONGDONG LI (Australian National University, Australia)\",Robotics + Driving,3D from Multiview and Sensors; RGBD sensors and analytics,Poster,2.1,211,O,,,\n2513,Versatile Multiple Choice Learning and Its Application to Vision Computing,Kai Tian (Fudan University); Yi Xu (Fudan University); Shuigeng Zhou (Fudan University)*; Jihong Guan (Tongji University),Vision Applications and Systems,\"Deep Learning ; Recognition: 
Detection, Categorization, Retrieval\",Poster,2.1,212,O,,,\n2867,EV-Gait: Event-based Robust Gait Recognition using Dynamic Vision Sensors,\"Yanxiang Wang (Harbin Engineering University); Yiran Shen (Data61, CSIRO)*; Bowen Du (University of Warwick); Kai Wu (Fudan University); Guangrong Zhao (Harbin Engineering University); Jianguo Sun (Harbin Engineering University); Hongkai Wen (University of Warwick)\",Vision Applications and Systems,\"Face, Gesture, and Body Pose ; Others; Recognition: Detection, Categorization, Retrieval\",Poster,2.1,213,O,,,\n2907,Deep Supervised Automatic Tooth Instance Segmentation and Identification from Cone Beam Computed Tomography Images,Zhiming Cui (HKU)*; Changjian Li (The University of Hong Kong); Wenping Wang (The University of Hong Kong),\"Medical, Biological and Cell Microscopy\",\"Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape\",Poster,2.1,214,O,,,\n3007,Modularized Textual Grounding for Counterfactual Resilience,\"Zhiyuan Fang (Arizona State University)*; Shu Kong (University of California, Irvine); Charless Fowlkes (UC Irvine); Yezhou Yang (Arizona State University)\",Vision Applications and Systems,\"Recognition: Detection, Categorization, Retrieval; Vision + Language\",Poster,2.1,215,O,,,\n3124,L3-Net: Towards Learning based LiDAR Localization for Autonomous Driving,Weixin Lu (Baidu ADU); Yao Zhou (Baidu ADU); Guowei Wan (Baidu Company); Shenhua Hou (Baidu ADU); Shiyu Song (Baidu ADU)*,Robotics + Driving,Vision Applications and Systems,Poster,2.1,216,O,,,\n2969,Bidirectional Learning for Domain Adaptation of Semantic Segmentation,Yunsheng Li (UCSD)*; Lu Yuan (Microsoft); Nuno Vasconcelos (UC San Diego),Deep Learning,\"Segmentation, Grouping and Shape\",Poster,2.2,1,A,,,\n3330,Enhanced Bayesian Compression via Deep Reinforcement Learning,Xin Yuan (Tsinghua University); Liangliang Ren (Tsinghua University); Jiwen Lu (Tsinghua University)*; Jie Zhou (Tsinghua University),Deep 
Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.2,2,A,,,\n3387,Strong-Weak Distribution Alignment for Adaptive Object Detection,Kuniaki Saito (Boston University)*; Yoshitaka Ushiku (The University of Tokyo); Tatsuya Harada (The University of Tokyo); Kate Saenko (Boston University),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,2.2,3,A,,,\n3429,MFAS: Multimodal Fusion Architecture Search,Juan-Manuel Perez-Rua (Samsung)*; Valentin Vielzeuf (Orange Labs / University of Caen); Stephane Pateux (Orange Labs); Moez Baccouche (Orange Labs); Frederic Jurie (University of Caen),Deep Learning,Others; Representation Learning,Poster,2.2,4,A,,,\n3440,Disentangling Adversarial Robustness and Generalization,David Stutz (Max Planck Institute for Informatics)*; Matthias Hein (University of Tübingen); Bernt Schiele (MPI Informatics),Deep Learning,Datasets and Evaluation; Representation Learning,Poster,2.2,5,A,,,\n3483,ShieldNets: Defending Against Adversarial Attacks using Policy Gradient Reinforcement Learning,\"Rajkumar Theagarajan (University of California, Riverside)*; Ming Chen ( Lawrence Berkeley National Laboratory); BIR BHANU (UC RIVERSIDE, USA.); Jing Zhang (KLA-Tencor)\",Deep Learning,Computer Vision Theory; Others; Vision Applications and Systems,Poster,2.2,6,A,,,\n3487,Deeply-Supervised Knowledge Synergy,Dawei Sun (Tsinghua University); Anbang Yao (Intel Labs China)*; Aojun Zhou (Intel Labs China); Hao Zhao (Intel Labs China),Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Poster,2.2,7,A,,,\n3503,Dual Residual Networks Leveraging the Potential of Paired Operations for Image Restoration,Xing Liu (Tohoku university)*; Masanori Suganuma (RIKEN AIP / Tohoku University); Zhun Sun (RIKEN Center for AIP); Takayuki Okatani (Tohoku University/RIKEN AIP),Deep Learning,Computational Photography; Image and Video Synthesis; Low-level Vision; Vision Applications and 
Syst,Poster,2.2,8,A,,,\n3543,Probabilistic End-to-end Noise Correction for Learning with Noisy Labels,Kun Yi (Nanjing University); Jianxin Wu (Nanjing University)*,Deep Learning,\"Others; Recognition: Detection, Categorization, Retrieval\",Poster,2.2,9,A,,,\n3547,Attention-guided Unified Network for Panoptic Segmentation,\"Yanwei Li (Institute of Automation, CAS; University of Chinese Academy of Sciences)*; Zheng Zhu (Institute of Automation, Chinese Academy of Sciences); Xinze Chen (Horizon Robotics); Lingxi Xie (Johns Hopkins University); Guan Huang (Horizon Robotics); Xingang Wang (Institute of Automation, CAS); Dalong Du (Horizon Robotics)\",Deep Learning,\"Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,2.2,10,A,,,\n3583,NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection,Golnaz Ghiasi (Google Brain)*; Tsung-Yi Lin (Google Brain); Quoc Le (Google Brain),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.2,11,A,,,\n3645,OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks,Jiashi Li (Beijing University of Posts and Telecommunications)*; Jingyu Wang (Beijing University of Posts and Telecommunications); Qi Qi (Beijing University of Posts and Telecommunications); Ce Ge (Beijing University of Posts and Telecommunications); Yujian Li (Beijing University of Posts and Telecommunications); Zhangzhang Yue (Beijing University of Posts and Telecommunications); Haifeng Sun (Beijing University of Posts and Telecommunications),Deep Learning,,Poster,2.2,12,A,,,\n3667,Semantically Aligned Bias Reducing Zero Shot Learning,Akanksha Paul (IIT Ropar)*; Narayanan C Krishnan (IIT Ropar); Prateek Munjal (IIT Ropar),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.2,13,A,,,\n3737,Feature Space Perturbations Yield More Transferable Adversarial Examples,Nathan Inkawhich (Duke University)*; Wei Wen (Duke University); Yiran Chen (Duke University); Hai Li 
(Duke University),Deep Learning,Others; Vision Applications and Systems,Poster,2.2,14,A,,,\n3787,IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction,\"Dominic Jack (Queensland University of Technology)*; Frederic Maire (Queensland University of Technology); SAREH SHIRAZI (Queensland University of Technology, Australia); Anders Eriksson (Queensland University of Technology)\",Deep Learning,,Poster,2.2,15,A,,,\n3815,Accelerating Convolutional Neural Networks via Activation Map Compression,Georgios Georgiadis (Samsung)*,Deep Learning,,Poster,2.2,16,A,,,\n3844,Knowledge Distillation via Instance Relationship Graph,\"Yufan Liu (Institute of Automation, Chinese Academy Sciences); Jiajiong Cao (Ant Financial); Bing Li (National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences)*; Chunfeng Yuan (NLPR); Weiming Hu (Institute of Automation，Chinese Academy of Sciences); Yangxi Li (National Computer network Emergency Response technical Team/Coordination Center of China); Yunqiang Duan (National Computer Network Emergency Response Technical Team/Coordination Center of China)\",Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Poster,2.2,17,A,,,\n3943,PPGNet: Learning Point-Pair Graph for Line Segment Detection,Ziheng Zhang (Shanghaitech University); Zhengxin Li (ShanghaiTech University)*; Ning Bi (Shanghaitech University); Jia Zheng (ShanghaiTech University); Jinlei Wang (Shanghaitech University); Kun Huang (ShanghaiTech); Weixin Luo (Shanghaitech University); Yanyu Xu (Shanghaitech University); Shenghua Gao (Shanghaitech University),Deep Learning,Vision + Graphics,Poster,2.2,18,A,,,\n3951,Building Detail-Sensitive Semantic Segmentation Networks with Polynomial Pooling,\"Zhen Wei (Institute of Information Engineering, the Chinese Academy of Sciences)*; Jingyi Zhang (University of Electronic Science and Technology of China); Li Liu (the inception 
institute of artificial intelligence); Fan Zhu (Inception Institute of Artificial Intelligence); Fumin Shen (UESTC); yao sun (iie); si liu (Beihang University); Fan Zhu (the inception institute of artificial intelligence ); Yi Zhou (Inception Institute of Artificial Intelligence); Ling Shao (Inception Institute of Artificial Intelligence)\",Deep Learning,\"Segmentation, Grouping and Shape\",Poster,2.2,19,A,,,\n3967,Variational Bayesian Dropout,Yuhang Liu (Wuhan University); Wenyong Dong (Wuhan University)*; Lei Zhang (The University of Adelaide); Dong Gong (The University of Adelaide); Qinfeng Shi (University of Adelaide),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Statistical Learning\",Poster,2.2,20,A,,,\n4013,AANet: Attribute Attention Network for Person Re-Identification,Chiat Pin Tay (Nanyang Technological University)*; Sharmili Roy (Nanyang Technological University); Kim Yap (Nanyang Technological University),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,2.2,21,A,,,\n4106,Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction,Osama Makansi (University of Freiburg)*; Eddy Ilg (University of Freiburg); Özgün Çiçek (University of Freiburg); Thomas Brox (University of Freiburg),Deep Learning,Robotics + Driving; Statistical Learning,Poster,2.2,22,A,,,\n4129,A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks,Yinghao Xu (Zhejiang University)*; Xin Dong (Harvard University); Yudian Li (None); Hao Su (UCSD),Deep Learning,Deep Learning,Poster,2.2,23,A,,,\n4131,PointNetLK: Robust & Efficient Point Cloud Registration using PointNet,Hunter M Goforth (Carnegie Mellon University)*; Arun Srivatsan Rangaprasad (Carnegie Mellon University); Yasuhiro Aoki (Fujitsu Laboratories Ltd.); Simon Lucey (CMU),Deep Learning,3D from Multiview and Sensors; Motion and Tracking,Poster,2.2,24,A,,,\n1462,Panoptic Feature Pyramid 
Network,Alexander Kirillov (Facebook AI Research)*; Kaiming He (Facebook AI Research); Ross Girshick (FAIR); Piotr Dollar (FAIR),\"Recognition: Detection, Categorization, Retrieval\",\"Segmentation, Grouping and Shape\",Oral,2.2,25,B,,,\n2705,Mask Scoring R-CNN,Zhaojin Huang (Huazhong University of Science and Technology); Lichao Huang (Horizon Robotics); Yongchao Gong (Horizon Robotics ); Chang Huang (Horizon Robotics); Xinggang Wang (Huazhong Univ. of Science and Technology)*,\"Recognition: Detection, Categorization, Retrieval\",,Oral,2.2,26,B,,,\n3864,Reasoning-RCNN: Unifying Adaptive Global Reasoning into Large-scale Object Detection,Hang Xu (Huawei Noah's Ark Lab); ChenHan Jiang (Sun Yat-sen University); Xiaodan Liang (Sun Yat-sen University)*; Liang Lin (Sun Yat-sen University); Zhenguo Li (Huawei Noah's Ark Lab),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Visual Reasoning,Oral,2.2,27,B,,,\n1476,Cross-Modality Personalization for Retrieval,Nils Murrugarra-Llerena (University of Pittsburgh)*; Adriana Kovashka (University of Pittsburgh),\"Recognition: Detection, Categorization, Retrieval\",Datasets and Evaluation; Vision + Language,Oral,2.2,28,B,,,\n2623,Composing Text and Image for Image Retrieval - An Empirical Odyssey,\"Nam Vo (Georgia Institute of Technology)*; Lu Jiang (Google); Chen Sun (Google); Kevin Murphy (Google); Li-Jia Li (Stanford); Li Fei-Fei (Stanford University); James Hays (Georgia Institute of Technology, USA)\",\"Recognition: Detection, Categorization, Retrieval\",Datasets and Evaluation; Representation Learning; Vision + Language,Oral,2.2,29,B,,,\n3524,Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation,\"Xiaobing Wang (Samsung Research Institute China-Beijing)*; yingying jiang ( Samsung Research China,Beijing); Zhenbo Luo ( Samsung Research Institute China-Beijing); Cheng-lin Liu (Institute of Automation of Chinese Academy of Sciences); Hyunsoo Choi (SAMSUNG ELECTRONICS CO.,LTD); Sungjin 
Kim (SAMSUNG ELECTRONICS CO.,LTD)\",\"Recognition: Detection, Categorization, Retrieval\",Vision Applications and Systems,Oral,2.2,30,B,,,\n2657,Adaptive NMS: Refining Pedestrian Detection in a Crowd,\"Songtao Liu (BUAA); Di Huang (Beihang University, China)*; Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China)\",\"Recognition: Detection, Categorization, Retrieval\",\"Face, Gesture, and Body Pose\",Oral,2.2,31,B,,,\n3517,\"Point in, Box out: Beyond Counting Persons in Crowds\",yuting liu (sichuan university)*; Miaojing Shi (Inria Rennes); Qijun Zhao (Sichuan University); Xiaofang Wang (Inria Rennes),\"Recognition: Detection, Categorization, Retrieval\",,Oral,2.2,32,B,,,\n6264,Locating Objects Without Bounding Boxes,Javier Ribera (Purdue University)*; David Güera (Purdue University); Yuhao Chen (Purdue University); Edward Delp (Purdue University),\"Recognition: Detection, Categorization, Retrieval\",,Oral,2.2,33,B,,,\n3333,FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery,\"Krishna Kumar Singh (University of California Davis)*; Utkarsh Ojha (University of California, Davis); Yong Jae Lee (University of California, Davis)\",\"Recognition: Detection, Categorization, Retrieval\",Image and Video Synthesis,Oral,2.2,34,B,,,\n3505,Mutual Learning of Complementary Networks via Residual Correction for Improving Semi-Supervised Classification,Si Wu (South China University of Technology)*; Jichang Li (South China University of Technology); Cheng Liu (City University of Hong Kong); Zhiwen Yu (South China University of Technology); Hau San Wong (City University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,2.2,35,B,,,\n4012,Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects,\"Yusuke Niitani (Preferred Networks, Inc.)*; Takuya Akiba (Preferred Networks, Inc.); Tommi Kerola (Preferred 
Networks, Inc.); Toru Ogawa (Preferred Networks, Inc.); Shotaro Sano (Preferred Networks, Inc.); Shuji Suzuki (Preferred Networks, Inc.)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,2.2,36,B,,,\n4099,Curls & Whey: Boosting Black-Box Adversarial Attacks,Yucheng Shi (Tianjin University); Siyu Wang (Tianjin University); Yahong Han (Tianjin University)*,\"Recognition: Detection, Categorization, Retrieval\",Vision Applications and Systems,Oral,2.2,37,B,,,\n5988,Barrage of Random Transforms for Adversarially Robust Defense,Edward Raff (Booz Allen Hamilton)*; Jared Sylvester (Booz Allen Hamilton); Steven Forsyth (Nvidia); Mark McLean (Laboratory for Physical Sciences),\"Recognition: Detection, Categorization, Retrieval\",\"Big Data, Large Scale Methods ; Deep Learning\",Oral,2.2,38,B,,,\n4648,Aggregation Cross-Entropy for Sequence Recognition,Zecheng Xie (South China University of Technology); Yaoxiong Huang (South China University of Technology); Yuanzhi Zhu (South China University of Technology); Lianwen Jin (South China University of Technology)*; Yuliang Liu (South China University of Technology); Lele Xie (South China University of Technology),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Document Analysis ; Statistical Learning; Vision Applications and Systems,Oral,2.2,39,B,,,\n4674,LaSO: Label-Set Operations networks for multi-label few-shot learning,\"Amit Alfassy (IBM-Research); Leonid Karlinsky (IBM-Research)*; Amit Aides (IBM); Joseph Shtok (IBM-Research); Sivan Harary (IBM-Research); Rogerio Feris (IBM Research AI, MIT-IBM Watson AI Lab); Raja Giryes (Tel Aviv University); Alex Bronstein (Technion)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,2.2,40,B,,,\n5352,Few-Shot Learning with Localization in Realistic Settings,Davis Wertheimer (Cornell)*; Bharath Hariharan (Cornell University),\"Recognition: Detection, Categorization, Retrieval\",\"Deep Learning ; Segmentation, Grouping 
and Shape\",Oral,2.2,41,B,,,\n5575,AdaGraph: Unifying Predictive and Continuous Domain Adaptation through Graphs,Massimiliano Mancini (Sapienza University of Rome)*; Samuel Rota Bulò (Mapillary Research); Barbara Caputo (IIT); Elisa Ricci (FBK - Technologies of Vision),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Oral,2.2,42,B,,,\n2532,Few Shot Adaptive Faster R-CNN,\"tao wang (national university of singapore)*; Jiashi Feng (NUS); Li Yuan (National University of Singapore); Xiaopeng Zhang (Noah's Ark Lab, Huawei Inc.)\",\"Recognition: Detection, Categorization, Retrieval\",Representation Learning; Vision Applications and Systems,Poster,2.2,43,B,,,\n3351,VRSTC: Occlusion-Free Video Person Re-Identification,Ruibing Hou (Institute of Computing Technology，Chinese Academy); Bingpeng MA (UCAS)*; Hong Chang (Chinese Academy of Sciences); Xinqian Gu (University of Chinese Academy of Sciences); Shiguang Shan (Chinese Academy of Sciences); Xilin Chen (China),\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.2,44,B,,,\n3356,Compact Feature Learning for Multi-domain Image Classification,Yajing Liu (USTC); Xinmei Tian (USTC)*; Ya Li (IFLYTEK Research); Zhiwei Xiong (University of Science and Technology of China); Feng Wu (University of Science and Technology of China),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.2,45,B,,,\n3360,Adaptive Transfer Network for Cross-Domain Person Re-Identification,Jiawei Liu (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China)*; Di Chen (University of Science and Technology of China); Richang Hong (HeFei University of Technology); Meng Wang (Hefei University of Technology),\"Recognition: Detection, Categorization, Retrieval\",Image and Video Synthesis,Poster,2.2,46,B,,,\n3371,Large-Scale Few-Shot Learning: Knowledge Transfer with Class Hierarchy,Aoxue Li (Peking University); Tiange Luo (Peking University); Zhiwu Lu 
(Renmin University of China)*; Tao Xiang (University of Surrey); Liwei Wang (Peking University),\"Recognition: Detection, Categorization, Retrieval\",\"Big Data, Large Scale Methods\",Poster,2.2,47,B,,,\n3466,Moving Object Detection under Discontinuous Change in Illumination Using Tensor Low-Rank and Invariant Sparse Decomposition,Moein Shakeri (University of Alberta)*; Zhang Hong (University of Alberta),\"Recognition: Detection, Categorization, Retrieval\",\"Image and Video Synthesis; Representation Learning; Segmentation, Grouping and Shape; Video Analytic\",Poster,2.2,48,B,,,\n3573,Pedestrian Detection with Autoregressive Network Phases,Garrick Brazil (Michigan State University)*; Xiaoming Liu (Michigan State University),\"Recognition: Detection, Categorization, Retrieval\",Robotics + Driving; Scene Analysis and Understanding; Vision Applications and Systems,Poster,2.2,49,B,,,\n3594,All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification,Weijie Chen (Hikvision Research Institute); Yuan Zhang (Hikvision Research Institute); Di Xie (Hikvision Research Institute)*; Shiliang Pu (Hikvision Research Institute),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.2,50,B,,,\n3633,Stochastic Class-based Hard Example Mining for Deep Metric Learning,Yumin Suh (Seoul National University); Bohyung Han (Seoul National University); Wonsik Kim (Samsung Electronics); Kyoung Mu Lee (Seoul National University)*,\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.2,51,B,,,\n3682,Revisiting Local Descriptor based Image-to-Class Measure for Few-shot Learning,\"Wenbin Li (Nanjing University)*; Lei Wang (\"\"University of Wollongong, Australia\"\"); Jinglin Xu (Northwestern Polytechnical University); Jing Huo (Nanjing University); Yang Gao (Nanjing University); Jiebo Luo (U. 
Rochester)\",\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.2,52,B,,,\n3684,Towards Robust Curve Text Detection with Conditional Spatial Expansion,\"ZICHUAN LIU (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Sheng Yang (Nanyang Technological University); Fayao Liu (University of Adelaide); Weisi Lin (\"\"Nanyang Technological University, Singapore\"\"); Wang Ling Goh (Nanyang Technological University)\",\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.2,53,B,,,\n3750,Revisiting Perspective Information for Efficient Crowd Counting,Miaojing Shi (Inria Rennes)*; Zhaohui Yang (Peking University); Chao Xu (Peking University); Qijun Chen (Tongji University),\"Recognition: Detection, Categorization, Retrieval\",,Poster,2.2,54,B,,,\n3770,Towards Universal Object Detection by Domain Attention,\"Xudong Wang (University of California, San Diego)*; Zhaowei Cai (UCSD); Dashan Gao (12 Sigma Technologies); Nuno Vasconcelos (UCSD, USA)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,2.2,55,B,,,\n3773,Ensemble Deep Manifold Similarity Learning using Hard Proxies,Nicolas Aziere (Oregon State University)*; Sinisa Todorovic (Oregon State U),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,2.2,56,B,,,\n3816,Quantization Networks,\"Jiwei Yang (University of Science and Technology of China); Xu Shen (Alibaba Group); Jun Xing (miHoYo); Xinmei Tian (USTC)*; Houqiang Li (University of Science and Technology of China); Bing Deng (Damo Academy, Alibaba Group); Jianqiang Huang (Alibaba Group); Xiansheng Hua (Damo Academy, Alibaba Group)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.2,57,B,,,\n3890,RES-PCA: A Scalable Approach to Recovering Low-rank Matrices,Chong Peng (Qingdao University)*; Chenglizhao Chen (Qingdao University); Zhao Kang (University of Electronic Science and Technology 
of China); Jianbo Li (Qingdao University); Qiang Cheng (University of Kentucky),\"Recognition: Detection, Categorization, Retrieval\",Statistical Learning,Poster,2.2,58,B,,,\n3898,Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks,Dinesh Reddy Narapureddy (Carnegie mellon university)*; Minh P Vo (Facebook); Srinivasa G Narasimhan (Carnegie Mellon University),\"Recognition: Detection, Categorization, Retrieval\",3D from Multiview and Sensors; 3D from Single Image; Vision + Graphics,Poster,2.2,59,B,,,\n3924,Efficient Featurized Image Pyramid Network for Single Shot Detector,Yanwei Pang (Tianjin University); Tiancai Wang (Tianjin University)*; Rao Muhammad Anwer (Inception Institute of Artificial Intelligence ); Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence); Ling Shao (Inception Institute of Artificial Intelligence),\"Recognition: Detection, Categorization, Retrieval\",Scene Analysis and Understanding,Poster,2.2,60,B,,,\n3955,Multi-Task Multi-Sensor Fusion for 3D Object Detection,Ming Liang (Uber ATG); Bin Yang (Uber ATG & University of Toronto)*; Yun Chen (Uber ATG Toronto); Rui Hu (Uber); Raquel Urtasun (Uber ATG),\"Recognition: Detection, Categorization, Retrieval\",3D from Multiview and Sensors; Deep Learning ; Representation Learning; Robotics + Driving,Poster,2.2,61,B,,,\n3961,Domain Specific Batch Normalization for Unsupervised Domain Adaptation,Woong-Gi Chang (POSTECH)*; Tackgeun You (POSTECH); Seonguk Seo (Seoul National University); Suha Kwak (POSTECH); Bohyung Han (Seoul National University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.2,62,B,,,\n3997,Grid R-CNN,Xin Lu (Zhejiang University)*; Buyu LI (The Chinese University of Hong Kong); Yuxin Yue (BeihangUniversity); Quanquan Li (SenseTime Group Limited); Junjie Yan (Sensetime Group Limited),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.2,63,B,,,\n4062,MetaCleaner: Learning to Hallucinate Clean 
Representations for Noisy-Labeled Visual Recognition,\"weihe zhang (Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,2.2,64,B,,,\n4087,Image-based Navigation using Visual Features and Map,Janine D Thoma (ETH Zurich)*; Danda Pani Paudel (ETH Zürich); Ajad Chhatkuli (ETH Zurich); Thomas Probst (ETH Zurich); Luc Van Gool (ETH Zurich),\"Recognition: Detection, Categorization, Retrieval\",Computer Vision Theory; Motion and Tracking; Optimization Methods; Robotics + Driving; Vision Applic,Poster,2.2,65,B,,,\n4092,Triply Supervised Decoder Networks for Joint Detection and Segmentation,\"Jiale Cao (Tianjin University); Yanwei Pang (Tianjin University)*; Xuelong Li (Northwestern Polytechnical University, China)\",\"Recognition: Detection, Categorization, Retrieval\",Scene Analysis and Understanding,Poster,2.2,66,B,,,\n4107,Leveraging the Invariant Side of Generative Zero-Shot Learning,Jingjing Li (University of Electronic Science and Technology of China)*; Mengmeng Jing (University of Electronic Science and Technology of China); Ke Lu (University of Electronic Science and Technology of China); Zhengming Ding (Indiana University-Purdue University Indianapolis); Lei Zhu (Shandong Normal University); Zi Huang (University of Queensland),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Image and Video Synthesis,Poster,2.2,67,B,,,\n4171,Exploring the Bounds of the Utility of Context for Object Detection,\"Ehud Barnea (Ben-Gurion University, Beer Sheva, Israel)*; Ohad Ben-Shahar (Ben Gurion University, Israel)\",\"Recognition: Detection, Categorization, Retrieval\",Scene Analysis and Understanding,Poster,2.2,68,B,,,\n3443,A-CNN: Annularly Convolutional Neural Networks on 
Point Clouds,Artem Komarichev (Wayne State University); Zichun Zhong (Wayne State University)*; Jing Hua (Wayne State University),\"Segmentation, Grouping and Shape\",Vision + Graphics,Poster,2.2,69,C,,,\n3475,DARNet: Deep Active Ray Network for Building Segmentation,Dominic Cheng (University of Toronto)*; Renjie Liao (University of Toronto); Sanja Fidler (University of Toronto); Raquel Urtasun (Uber ATG),\"Segmentation, Grouping and Shape\",Deep Learning ; Vision Applications and Systems,Poster,2.2,70,C,,,\n3631,Point Cloud Oversegmentation with Graph-Structured Deep Metric Learning,loic landrieu (IGN)*; Mohamed Boussaha (IGN),\"Segmentation, Grouping and Shape\",3D from Multiview and Sensors; Deep Learning ; Representation Learning; Scene Analysis and Understan,Poster,2.2,71,C,,,\n3655,Graphonomy: Universal Human Parsing via Graph Transfer Learning,Ke Gong (Sun Yat-sen University ); Yiming Gao (Sun Yat-sen University); Xiaodan Liang (Sun Yat-sen University)*; Xiaohui Shen (ByteDance AI Lab); Meng Wang (Hefei University of Technology); Liang Lin (Sun Yat-sen University),\"Segmentation, Grouping and Shape\",Deep Learning,Poster,2.2,72,C,,,\n3681,Multiple Heterogeneous Models Fitting by Multi-class Cascaded T-linkage,Luca Magri (University of Udine); Andrea Fusiello (UNIUD)*,\"Segmentation, Grouping and Shape\",,Poster,2.2,73,C,,,\n3710,A Late Fusion CNN for Digital Matting,Zhang Yunke (Zhejiang University); Lixue Gong (Zhejiang university); Lubin Fan (Alibaba Group); Peiran Ren (Alibaba ); Qixing Huang (The University of Texas at Austin); Hujun Bao (Zhejiang University); Weiwei Xu (Zhejiang University)*,\"Segmentation, Grouping and Shape\",Low-level Vision,Poster,2.2,74,C,,,\n3784,BASNet: Boundary Aware Salient Object Detection,Xuebin Qin (University of Alberta)*; Zichen Zhang (University of Alberta); Chenyang Huang (University of Alberta); Chao Gao (University of Alberta); Masood Dehghan (University of Alberta); Martin Jagersand (University of 
Alberta),\"Segmentation, Grouping and Shape\",Deep Learning ; Vision Applications and Systems,Poster,2.2,75,C,,,\n3828,ZigZagNet: Fusing Top-Down and Bottom-Up Context for Object Segmentation,Di Lin (Shenzhen University)*; Dingguo Shen (Shenzhen University); Siting Shen (Shenzhen University); Yuanfeng Ji (Shenzhen University); Dani Lischinski (The Hebrew University of Jerusalem); Daniel Cohen-Or (Tel Aviv University); Hui Huang (Shenzhen University),\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Poster,2.2,76,C,,,\n3977,Object Instance Annotation with Deep Extreme Level Set Evolution,Zian Wang (Tsinghua University)*; David Acuna (University of Toronto); Amlan Kar (University of Toronto); Huan Ling (University of Toronto); Sanja Fidler (University of Toronto),\"Segmentation, Grouping and Shape\",,Poster,2.2,77,C,,,\n4018,Leveraging Crowdsourced GPS Data for Road Extraction from Aerial Imagery,Tao Sun (Tongji University); Zonglin Di (Tongji University); Pengyu Che (Tongji University); Chun Liu (Tongji University); Wang Yin (Tongji University)*,\"Segmentation, Grouping and Shape\",Vision Applications and Systems,Poster,2.2,78,C,,,\n4080,Adaptive Pyramid Context Network for Semantic Segmentation,\"Junjun He (SJTU); Zhongying Deng (SIAT); Lei Zhou (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*\",\"Segmentation, Grouping and Shape\",Deep Learning ; Vision Applications and Systems,Poster,2.2,79,C,,,\n3421,\"Isospectralization, or how to hear shape, style, and correspondence\",Luca Cosmo (University of Venice); Maks Ovsjanikov (Ecole polytechnique); Mikhail Panine (Ecole polytechnique); Arianna Rampini (Sapienza University of Rome)*; Michael Bronstein (Università della Svizzera Italiana); Emanuele Rodola (Sapienza University of Rome),Physics-based Vision and 
Shape-from-X,\"Segmentation, Grouping and Shape; Vision + Graphics\",Poster,2.2,80,D,,,\n3427,Speech2Face: Learning the Face Behind a Voice,Tali Dekel (Google)*; Tae-Hyun Oh (MIT CSAIL); Changil Kim (MIT CSAIL); Michael Rubinstein (Google); Bill Freeman (Google); Wojciech Matusik (MIT); Inbar Mosseri (Google),\"Big Data, Large Scale Methods\",Deep Learning ; Image and Video Synthesis,Poster,2.2,81,D,,,\n3455,Joint manifold diffusion for combining predictions on decoupled observations,Kwang In Kim (UNIST)*; Hyung Jin Chang (University of Birmingham),Statistical Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning; Vision Applications and\",Poster,2.2,82,D,,,\n3469,Audio-Visual Scene-Aware Dialog,\"Huda A Alamri (Georgia Institute of Technology )*; Vincent Cartillier (Georgia Tech); Abhishek Das (Georgia Tech); Jue Wang (Mitsubishi Electric Research Laboratories (MERL)); Anoop Cherian (MERL); Chiori Hori (Mitsubishi Electric Research Laboratories (MERL)); Tim K Marks (Mitsubishi Electric Research Laboratories, USA); Peter Anderson (Georgia Tech); Stefan Lee (Georgia Institute of Technology); Irfan Essa (Georgia Institute of Technology); Dhruv Batra (Georgia Tech & Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research)\",Datasets and Evaluation,Scene Analysis and Understanding; Vision + Language,Poster,2.2,83,D,,,\n3540,Learning to Minify Photometric Stereo,Junxuan Li (The Australian National University)*; Shaodi You (Data61-CSIRO); Yasuyuki Matsushita (Osaka University); Antonio Robles-Kelly (Deakin University),Physics-based Vision and Shape-from-X,Deep Learning,Poster,2.2,84,D,,,\n3702,Reflective and Fluorescent Separation under Narrow-Band Illumination,Koji Koyamatsu (Kyushu Institute of Technology); Daichi Hidaka (Kyushu Institute of Technology); Takahiro Okabe (Kyushu Institute of Technology)*; Hendrik P. A. 
Lensch (University of Tübingen),Physics-based Vision and Shape-from-X,Computational Photography,Poster,2.2,85,D,,,\n3760,Depth from a polarisation + RGB stereo pair,DIZHONG ZHU (University of York)*; William Smith (University of York),Physics-based Vision and Shape-from-X,3D from Multiview and Sensors; Computational Photography,Poster,2.2,86,D,,,\n4016,Rethinking the Evaluation of Video Summaries,\"Mayu Otani (CyberAgent, Inc.)*; Yuta Nakashima (Osaka University); Esa Rahtu (Tampere University of Technology); Janne Heikkila (University of Oulu, Finland)\",Datasets and Evaluation,Vision Applications and Systems,Poster,2.2,87,D,,,\n4153,What Object Should I Use? - Task Driven Object Detection,Johann Sawatzky (University of Bonn)*; Yaser Souri (University of Bonn); Christian Grund (University of Bonn); Jürgen Gall (University of Bonn),Datasets and Evaluation,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.2,88,D,,,\n3878,Triangulation Learning Network: from Monocular to Stereo 3D Object Detection,Zengyi Qin (Tsinghua University); Jinglu Wang (Microsoft Research Asia)*; Yan Lu (Microsoft Research Asia),3D from Multiview and Sensors,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.2,89,E,,,\n3965,Connecting the Dots: Learning Representations for Active Monocular Depth Estimation,Gernot Riegler (Intel Labs)*; Yiyi Liao (MPI Tuebingen); Simon Donné (MPI-IS); Vladlen Koltun (Intel Labs); Andreas Geiger (MPI-IS and University of Tuebingen),3D from Multiview and Sensors,3D from Single Image; Deep Learning,Poster,2.2,90,E,,,\n4006,Learning Non-Volumetric Depth Fusion using Successive Reprojections,Simon Donné (MPI-IS and University of Tübingen)*; Andreas Geiger (MPI-IS and University of Tuebingen),3D from Multiview and Sensors,Datasets and Evaluation; Deep Learning,Poster,2.2,91,E,,,\n4090,Stereo R-CNN based 3D Object Detection for Autonomous Driving,Peiliang LI (HKUST Robotics Institute)*; Xiaozhi Chen (DJI); Shaojie Shen (HKUST),3D from 
Multiview and Sensors,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Robotics + Driving\",Poster,2.2,92,E,,,\n4167,Hybrid Scene Compression for Visual Localization,Federico Camposeco (ETH Zurich); Andrea Cohen (Oculus); Marc Pollefeys (ETH Zurich / Microsoft); Torsten Sattler (Chalmers University of Technology)*,3D from Multiview and Sensors,\"Recognition: Detection, Categorization, Retrieval; Robotics + Driving\",Poster,2.2,93,E,,,\n3339,MMFace: A Multi-Metric Regression Network for Unconstrained Face Reconstruction,Hongwei Yi (Peking University)*; Chen Li (Tencent); Qiong Cao (Tencent); Xiaoyong Shen (Tencent); Sheng Li (Peking University); Guoping Wang (Peking University); Yu-Wing Tai (Tencent),3D from Single Image,\"Face, Gesture, and Body Pose\",Poster,2.2,94,F,,,\n3727,3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis,Xiaojuan Qi (University of Oxford)*; Zhengzhe Liu (DJI); Qifeng Chen (HKUST); Jiaya Jia (Chinese University of Hong Kong),RGBD sensors and analytics,Image and Video Synthesis,Poster,2.2,95,F,,,\n3742,Visual Localization by Learning Objects-of-Interest Dense Match Regression,Philippe Weinzaepfel (Naver Labs Europe)*; Gabriela Csurka (Naver Labs Europe); Yohann Cabon (Naver Labs Europe); Martin Humenberger (Naver Labs Europe),3D from Single Image,\"3D from Multiview and Sensors; Recognition: Detection, Categorization, Retrieval\",Poster,2.2,96,F,,,\n3858,RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion,\"Jie Li (Nanjing University of Science and Technology)*; Yu Liu (The University of Adelaide); Dong Gong (The University of Adelaide); Qinfeng Shi (University of Adelaide); Xia Yuan (Nanjing University of Science and Technology); Chunxia Zhao (Nanjing University of Science and Technology); Ian Reid (\"\"University of Adelaide, Australia\"\")\",3D from Single Image,RGBD sensors and analytics,Poster,2.2,97,F,,,\n4159,Neural Scene Decomposition for Human Motion Capture,\"Helge 
Rhodin (EPFL)*; Victor Constantin (EPFL); Isinsu Katircioglu (EPFL); Mathieu Salzmann (EPFL); Pascal Fua (EPFL, Switzerland)\",3D from Single Image,\"Deep Learning ; Representation Learning; Scene Analysis and Understanding; Segmentation, Grouping an\",Poster,2.2,98,F,,,\n2676,Efficient Decision-based Black-box Adversarial Attacks on Face Recognition,Yinpeng Dong (Tsinghua University)*; Hang Su (Tsinghua University); Baoyuan Wu (Tencent AI Lab); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent); Tong Zhang (Tencent AI Lab); Jun Zhu (Tsinghua University),\"Face, Gesture, and Body Pose\",Deep Learning,Poster,2.2,99,G,,,\n3406,FA-RPN: Floating Region Proposals for Face Detection,Mahyar Najibi (University of Maryland)*; Bharat Singh (Amazon); Larry Davis (University of Maryland),\"Face, Gesture, and Body Pose\",\"Recognition: Detection, Categorization, Retrieval\",Poster,2.2,100,G,,,\n3416,Bayesian Hierarchical Dynamic Model for Human Action Recognition,Rui Zhao (RPI)*; Wanru Xu (Beijing Jiaotong University); Hui Su (IBM); Qiang Ji (Rensselaer Polytechnic Institute),\"Face, Gesture, and Body Pose\",Action Recognition ; Statistical Learning,Poster,2.2,101,G,,,\n3456,Mixed Effects Convolutional Neural Networks (MeNets) with Applications to Gaze Estimation,Yunyang Xiong (University of Wisconsin-Madison)*; Hyunwoo J Kim (Korea University); Vikas Singh (University of Wisconsin-Madison USA),\"Face, Gesture, and Body Pose\",Motion and Tracking,Poster,2.2,102,G,,,\n3495,3D human pose estimation in video with temporal convolutions and semi-supervised training,Dario Pavllo (Facebook)*; Christoph Feichtenhofer (Facebook AI Research); David Grangier (Google); Michael Auli (Facebook),\"Face, Gesture, and Body Pose\",,Poster,2.2,103,G,,,\n3603,Semantic Component Decomposition for Face Attribute Manipulation,Yingcong Chen (Chinese University of Hong Kong)*; Xiaohui Shen (ByteDance AI Lab); Zhe Lin (Adobe Research); Xin Lu (Adobe); I-Ming Pao (Adobe); Jiaya Jia (Chinese University of 
Hong Kong),\"Face, Gesture, and Body Pose\",Image and Video Synthesis; Vision Applications and Systems,Poster,2.2,104,G,,,\n3668,PoseFix: Model-agnostic General Human Pose Refinement Network,Gyeongsik Moon (Seoul National University); Ju Yong Chang (Kwangwoon University); Kyoung Mu Lee (Seoul National University)*,\"Face, Gesture, and Body Pose\",Deep Learning,Poster,2.2,105,G,,,\n3692,RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation,Bastian Wandt (Leibniz University Hannover)*; Bodo Rosenhahn (Leibniz University Hannover),\"Face, Gesture, and Body Pose\",3D from Single Image,Poster,2.2,106,G,,,\n3873,Fast and Robust Multi-Person 3D Pose Estimation from Multiple Views,\"Junting Dong (Zhejiang University); Wen Jiang (Zhejiang University); Qixing Huang (The University of Texas at Austin); Hujun Bao (Zhejiang University); Xiaowei Zhou (Zhejiang Univ., China)*\",\"Face, Gesture, and Body Pose\",3D from Multiview and Sensors,Poster,2.2,107,G,,,\n3875,Face-Focused Cross-Stream Network for Deception Detection in Videos,Mingyu Ding (Renmin University of China); An Zhao (Renmin University of China); Zhiwu Lu (Renmin University of China)*; Tao Xiang (University of Surrey); Ji-Rong Wen (Renmin University of China),\"Face, Gesture, and Body Pose\",Video Analytics,Poster,2.2,108,G,,,\n3883,Unequal-training for deep face recognition with long-tailed noisy data,\"Yaoyao Zhong (Beijing University of Posts and Telecommunications)*; Mei Wang (Beijing University of Posts and Telecommunications); Jiani Hu (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications); Jianteng Peng (Canon Information Technology (Beijing) Co., Ltd); Xunqiang Tao (Canon Information Technology (Beijing) Co., Ltd); Yaohai Huang (Canon Information Technology (Beijing) Co., Ltd)\",\"Face, Gesture, and Body Pose\",Deep Learning ; Image and Video Synthesis; Representation 
Learning,Poster,2.2,109,G,,,\n4029,T-Net: High-Order Tensor FCN,Jean Kossaifi (Imperial College London)*; Adrian Bulat (University of Nottingham); Georgios Tzimiropoulos (University of Nottingham); Maja Pantic (Imperial College London / Samsung ),\"Face, Gesture, and Body Pose\",Deep Learning,Poster,2.2,110,G,,,\n4104,Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss,Lele Chen (University of Rochester)*; Ross Maddox (University of Rochester); Zhiyao Duan (University of Rochester); Chenliang Xu (University of Rochester),\"Face, Gesture, and Body Pose\",Deep Learning ; Image and Video Synthesis; Video Analytics,Poster,2.2,111,G,,,\n3379,Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video,Radu Tudor Ionescu (University of Bucharest)*; Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence); Mariana-Iuliana Georgescu (University of Bucharest); Ling Shao (Inception Institute of Artificial Intelligence),Video Analytics,Action Recognition ; Deep Learning ; Motion and Tracking; Vision Applications and Systems,Poster,2.2,112,H,,,\n3391,DDLSTM: Dual-Domain LSTM for Cross-Dataset Action Recognition,Toby Perrett (University of Bristol); Dima Damen (University of Bristol)*,Action Recognition,Statistical Learning; Video Analytics,Poster,2.2,113,H,,,\n3398,The Pros and Cons: Rank-aware Temporal Attention for Skill Determination in Long Videos,Hazel Doughty (University of Bristol)*; Walterio Mayol-Cuevas (Bristol University); Dima Damen (University of Bristol),Video Analytics,\"Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,2.2,114,H,,,\n3544,Collaborative Spatiotemporal Feature Learning for Video Action Recognition,Chao Li (Hikvision Research Institute); Qiaoyong Zhong (Hikvision Research Institute); Di Xie (Hikvision Research Institute)*; Shiliang Pu (Hikvision Research Institute),Action Recognition,Deep Learning,Poster,2.2,115,H,,,\n3605,MARS: Motion-Augmented 
RGB Stream for Action Recognition,Nieves Crasto (INRIA)*; Philippe Weinzaepfel (Naver Labs Europe); Karteek Alahari (Inria); Cordelia Schmid (INRIA),Action Recognition,,Poster,2.2,116,H,,,\n3798,Convolutional Relational Machine for Group Activity Recognition,Sina Mokhtarzadeh Azar (Amirkabir University of Technology); Mina Ghadimi Atigh (Amirkabir University of Technology); Ahmad Nickabadi (Amirkabir University of Technology )*; Alexandre Alahi (EPFL),Action Recognition,Deep Learning,Poster,2.2,117,H,,,\n3885,Video Summarization by Learning from Unpaired Data,Mrigank Rochan (University of Manitoba)*; Yang Wang (University of Manitoba),Video Analytics,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.2,118,H,,,\n4047,Skeleton-Based Action Recognition with Directed Graph Neural Networks,\"Lei Shi (Institute of Automation，Chinese Academy of Sciences )*; Yifan o Zhang (Institute of Automation, Chinese Academy of Sciences); Jian Cheng (\"\"Chinese Academy of Sciences, China\"\"); Hanqing Lu (NLPR, Institute of Automation, CAS)\",Action Recognition,Deep Learning,Poster,2.2,119,H,,,\n4070,PA3D: Pose-Action 3D Machine for Video Recognition,\"An Yan (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Zhifeng Li (Tencent AI Lab); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*\",Action Recognition,,Poster,2.2,120,H,,,\n4117,Deep Dual Relation Modeling for Egocentric Interaction Recognition,\"Haoxin Li (Sun Yat-sen University)*; Yijun Cai (Sun Yat-sen University); WEI-SHI ZHENG (Sun Yat-sen University, China)\",Action Recognition,\"Recognition: Detection, Categorization, Retrieval; Video Analytics\",Poster,2.2,121,H,,,\n3336,MOTS: Multi-Object Tracking and Segmentation,Paul Voigtlaender (RWTH Aachen University)*; Michael Krause (RWTH Aachen University); Aljosa Osep (RWTH Aachen University); Jonathon Luiten (RWTH Aachen 
University); Berin Balachandar Gnana Sekar (RWTH Aachen University); Andreas Geiger (MPI-IS and University of Tuebingen); Bastian Leibe (RWTH Aachen University-),Motion and Tracking,\"Deep Learning ; Segmentation, Grouping and Shape\",Poster,2.2,122,I,,,\n3821,Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking,Heng Fan (Temple University); Haibin Ling (Temple University)*,Motion and Tracking,Deep Learning ; Vision Applications and Systems,Poster,2.2,123,I,,,\n4050,PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds,Aseem Behl (MPI-IS and University of Tuebingen)*; Despoina Paschalidou (MPI-IS Tuebingen ); Simon Donné (MPI-IS and University of Tübingen); Andreas Geiger (MPI-IS and University of Tuebingen),Motion and Tracking,Deep Learning,Poster,2.2,124,I,,,\n3386,Listen to the Image,\"Di Hu (NWPU)*; Dong Wang (Northwestern Polytechnical University); Feiping Nie (Northwestern Polytechnical University); Qi Wang (Northwestern Polytechnical University); Xuelong Li (Northwestern Polytechnical University, China)\",Image and Video Synthesis,\"Computational Photography; Datasets and Evaluation; Image and Video Synthesis; Medical, Biological a\",Poster,2.2,125,J,,,\n3437,Image Super-Resolution by Neural Texture Transfer,Zhifei Zhang (University of Tennessee)*; Zhaowen Wang (Adobe Research); Zhe Lin (Adobe Research); Hairong Qi (University of Tennessee-Knoxville),Image and Video Synthesis,Deep Learning,Poster,2.2,126,J,,,\n3619,Conditional Adversarial Generative Flow for Controllable Image Synthesis,\"Rui Liu (Chinese University of Hong Kong)*; Yu Liu (The Chinese University of Hong Kong); Xinyu Gong (Texas A&M University ); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Hongsheng Li (Chinese University of Hong Kong)\",Image and Video Synthesis,\"Face, Gesture, and Body Pose ; Representation Learning\",Poster,2.2,127,J,,,\n3811,How to make a pizza: Learning a compositional layer-based GAN model,\"Dim P 
Papadopoulos (MIT)*; Youssef Tamaazousti (MIT); Ferda Ofli (Qatar Computing Research Institute, HBKU); Ingmar Weber (Qatar Computing Research Institute, Qatar); Antonio Torralba (MIT)\",Image and Video Synthesis,\"Recognition: Detection, Categorization, Retrieval\",Poster,2.2,128,J,,,\n3983,Geometry-Aware Unsupervised Image-to-Image Translation,Wayne Wu (SenseTime Research)*; Kaidi Cao (Stanford University); Cheng Li (SenseTime Research); Chen Qian (SenseTime); Chen Change Loy (Nanyang Technological University),Image and Video Synthesis,\"Deep Learning ; Face, Gesture, and Body Pose ; Representation Learning; Vision Applications and Syst\",Poster,2.2,129,J,,,\n1064,From One Photon to a Billion: High Flux Imaging with Single-Photon Sensors,\"Atul N Ingle (University of Wisconsin-Madison)*; Andreas Velten (University of Wisconsin - Madison); Mohit Gupta (\"\"University of Wisconsin-Madison, USA \"\")\",Computational Photography,Physics-based Vision and Shape-from-X,Oral,2.2,130,K,,,\n1624,Photon-Flooded Single-Photon 3D Cameras,\"Anant Gupta (University of Wisconsin Madison)*; Atul N Ingle (University of Wisconsin-Madison); Andreas Velten (University of Wisconsin - Madison); Mohit Gupta (\"\"University of Wisconsin-Madison, USA \"\")\",Computational Photography,Physics-based Vision and Shape-from-X,Oral,2.2,131,K,,,\n2059,Acoustic Non-Line-of-Sight Imaging,David Lindell (Stanford University)*; Gordon Wetzstein (Stanford University); Vladlen Koltun (Intel Labs),Computational Photography,,Oral,2.2,132,K,,,\n3310,Steady-state Non-Line-of-Sight Imaging,Wenzheng Chen (University of Toronto); Simon Daneau (Algolux)*; Colin Brosseau (Algolux); Felix Heide (Princeton University),Computational Photography,Low-level Vision; Physics-based Vision and Shape-from-X,Oral,2.2,133,K,,,\n2427,A Theory of Fermat Paths for Non-Line-of-Sight Shape Reconstruction,Shumian Xin (Carnegie Mellon University); Sotiris Nousias (University College London); Kyros Kutulakos (University of Toronto); 
Aswin Sankaranarayanan (Carnegie Mellon University); Srinivasa G Narasimhan (Carnegie Mellon University); Ioannis Gkioulekas (Carnegie Mellon University)*,Computational Photography,Physics-based Vision and Shape-from-X,Oral,2.2,134,K,,,\n474,End-to-end Projector Photometric Compensation,Bingyao Huang (Temple University)*; Haibin Ling (Temple University),Vision + Graphics,Computational Photography; Deep Learning ; Others,Oral,2.2,135,K,,,\n2605,Bringing a Blurry Frame Alive at High Frame-Rate with an Event Camera,\"Liyuan Pan (The Australian National University)*; cedric scheerlinck (The Australian National University); RICHARD HARTLEY (Australian National University, Australia); Miaomiao Liu (The Australian National University); Yuchao Dai (Northwestern Polytechnical University); Xin Yu (Australian National University)\",Computational Photography,Image and Video Synthesis; Vision Applications and Systems,Oral,2.2,136,K,,,\n5932,Bringing Alive Blurred Moments!,\"Kuldeep Purohit (Indian Institute of Technology Madras)*; Anshul Shah (University of Maryland, College Park); Rajagopalan N Ambasamudram (Indian Institute of Technology Madras)\",Computational Photography,Low-level Vision,Oral,2.2,137,K,,,\n1607,Learning to Synthesize Motion Blur,Tim Brooks (Google)*; Jonathan T Barron (Google Research),Computational Photography,Deep Learning ; Image and Video Synthesis; Motion and Tracking; Vision + Graphics,Oral,2.2,138,K,,,\n2861,Underexposed Photo Enhancement using Deep Illumination Estimation,\"Ruixing Wang (The Chinese University of Hong Kong); Qing Zhang ( Sun Yat-sen University); Chi-Wing Fu (The Chinese University of Hong Kong); Xiaoyong Shen (Tencent); WEI-SHI ZHENG (Sun Yat-sen University, China)*; Jiaya Jia (Chinese University of Hong Kong)\",Computational Photography,Deep Learning ; Low-level Vision,Oral,2.2,139,K,,,\n2843,Blind Visual Motif Removal from a Single Image,Amir Hertz (Tel Aviv University)*; Sharon Fogel (Tel-Aviv university); Rana Hanocka (TAU); 
Raja Giryes (Tel Aviv University); Danny Cohen-Or (Tel Aviv University),Vision + Graphics,Deep Learning,Oral,2.2,140,K,,,\n6541,Non-local Meets Global: An Integrated Paradigm for Hyperspectral Denoising,Wei He (RIKEN AIP)*; Quanming Yao (4Paradigm); Chao Li (RIKEN); Naoto Yokoya (RIKEN Center for Advanced Intelligence Project (AIP)); Qibin Zhao (RIKEN),Low-level Vision,Vision + Graphics,Oral,2.2,141,K,,,\n4943,Total Scene Capture: Neural Rerendering in the Wild,\"Moustafa Meshry (University of Maryland)*; Ricardo Martin-Brualla (Google); Noah Snavely (Cornell University and Google AI); Hugues Hoppe (Google Inc.); Sameh Khamis (Google); Rohit Pandey (Google); Dan B Goldman (Google, Inc.)\",Vision + Graphics,3D from Multiview and Sensors; Image and Video Synthesis,Oral,2.2,142,K,,,\n430,GeoNet: Deep Geodesic Networks for Point Cloud Analysis,Tong He (UCLA)*; Haibin Huang (Face++ (Megvii)); Li Yi (Stanford); Yuqian Zhou (UIUC); QIHAO WU (Face++ (Megvii)); jue wang (Face++ (Megvii)); Stefano Soatto (UCLA),Vision + Graphics,\"Deep Learning ; Representation Learning; Segmentation, Grouping and Shape\",Oral,2.2,143,K,,,\n2440,MeshAdv: Adversarial Meshes for Visual Recognition,\"CHAOWEI XIAO (University of Michigan, Ann Arbor); Dawei Yang (University of Michigan, Ann Arbor)*; Bo Li (University of Illinois at Urbana–Champaign); Jia Deng (Princeton University); mingyan liu (university of Michigan, Ann Arbor)\",Vision + Graphics,Deep Learning,Oral,2.2,144,K,,,\n4701,Fast Spatially-Varying Indoor Lighting Estimation,Mathieu Garon (Université Laval); Kalyan Sunkavalli (Adobe Research); Nathan Carr (Adobe); Sunil Hadap (Adobe); Jean-Francois Lalonde (Université Laval)*,Vision + Graphics,Computational Photography; Deep Learning,Oral,2.2,145,K,,,\n1188,Neural Illumination: Lighting Prediction for Indoor Environments,\"Shuran Song (Princeton)*; Thomas Funkhouser (Princeton University and Google, Inc.)\",Vision + Graphics,3D from Single Image; Scene Analysis and 
Understanding,Oral,2.2,146,K,,,\n4363,Deep Sky Modeling for Single Image Outdoor Lighting Estimation,Yannick Hold-Geoffroy (Adobe Research)*; Akshaya Athwale (Indian Institute of Technology Dhanbad); Jean-Francois Lalonde (Université Laval),Vision + Graphics,Computational Photography; Deep Learning ; Scene Analysis and Understanding,Oral,2.2,147,K,,,\n3399,Depth-attentional Features for Single-image Rain Removal,Xiaowei Hu (The Chinese University of Hong Kong)*; Chi-Wing Fu (The Chinese University of Hong Kong); Lei Zhu (The Chinese University of Hong Kong); Pheng-Ann Heng (The Chinese University of Hong Kong),Computational Photography,Deep Learning ; Low-level Vision; Vision Applications and Systems,Poster,2.2,148,K,,,\n3925,Hyperspectral Image Reconstruction using a Spectral Regularization Prior,Lizhi Wang (Beijing Institute of Technology); Chen Sun (Beijing Institute of Technology); Ying Fu (Beijing Institute of Technology); Min H. Kim (KAIST); Hua Huang (Beijing Institute of Technology)*,Computational Photography,Physics-based Vision and Shape-from-X,Poster,2.2,149,K,,,\n3926,LiFF: Light Field Features in Scale and Depth,Donald G Dansereau (University of Sydney)*; Bernd Girod (Stanford University); Gordon Wetzstein (Stanford University),Computational Photography,,Poster,2.2,150,K,,,\n4128,Deep Exemplar-based Video Colorization,Bo Zhang (Hong Kong University of Science and Technology)*; Mingming He (Hong Kong University of Science and Technology); Jing Liao (City University of Hong Kong); Pedro Sander (HKUST); Lu Yuan (Microsoft); Amine Bermak (Hong Kong University of Science and Technology); Dong Chen (Microsoft Research Asia),Computational Photography,Deep Learning,Poster,2.2,151,K,,,\n4157,On Finding Gray Pixel,Yanlin Qian (Tampere University of Technology)*; Joni-Kristian Kamarainen (Tampere University); jarno nikkanen (Intel Finland); Jiri Matas (CMP CTU FEE),Computational Photography,Low-level Vision; Statistical Learning,Poster,2.2,152,K,,,\n3486,UnOS: 
Unified Unsupervised Optical-flow and Stereo-depth Estimation by Watching Videos,Yang Wang (Baidu USA)*; Peng Wang (Baidu USA LLC.); Zhenheng Yang (Facebook Research); Chenxu Luo (Johns Hopkins University); Yi Yang (Baidu Research); Wei Xu (Baidu Research),Low-level Vision,Deep Learning ; Image and Video Synthesis; Motion and Tracking; Robotics + Driving,Poster,2.2,153,L,,,\n3497,Learning Transformation Synchronization,\"Xiangru Huang (University of Texas at Austin); Zhenxiao Liang (The University of Texas at Austin); Xiaowei Zhou (Zhejiang Univ., China); Yao Xie (Georgia Tech); Qixing Huang (The University of Texas at Austin)*; Leonidas Guibas (Stanford University)\",Optimization Methods,3D from Multiview and Sensors,Poster,2.2,154,L,,,\n3623,D2-Net: A Trainable CNN for Joint Description and Detection of Local Features,\"Mihai Dusmanu (ETH Zurich)*; Ignacio ROCCO (Inria); Tomas Pajdla (Czech Technical University in Prague); Marc Pollefeys (ETH Zurich / Microsoft); Josef Sivic (INRIA); Akihiko Torii (Tokyo Institute of Technology, Japan); Torsten Sattler (Chalmers University of Technology)\",Low-level Vision,\"3D from Multiview and Sensors; Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,2.2,155,L,,,\n3670,Recurrent Neural Networks with Intra-Frame Iterations for Video Deblurring,Seungjun Nah (Seoul National University); Sanghyun Son (Seoul National University); Kyoung Mu Lee (Seoul National University)*,Low-level Vision,Computational Photography; Deep Learning,Poster,2.2,156,L,,,\n3721,Learning to Extract Flawless Slow Motion from Blurry Videos,Meiguang Jin (University of Bern)*; Zhe Hu (Hikvision Research); Paolo Favaro (University of Bern),Low-level Vision,Image and Video Synthesis,Poster,2.2,157,L,,,\n3768,Natural and Realistic Single Image Super-Resolution with Explicit Natural Manifold Discrimination,Jae Woong Soh (Seoul National University)*; Gu Yong Park (Seoul National University); Junho Jo (Seoul National University); Nam Ik Cho 
(Seoul National University),Low-level Vision,Deep Learning,Poster,2.2,158,L,,,\n3854,RF-Net: An End-to-End Image Matching Network based on Receptive Field,Xuelun Shen (Xiamen University); Cheng Wang (Xiamen University)*; Zenglei Yu (Xiamen University); Xin Li (Louisiana State University); Jonathan Li (Xiamen University); Chenglu Wen (Xiamen University); Ming Cheng (Xiamen University); Zijian He (Xiamen University),Low-level Vision,Deep Learning,Poster,2.2,159,L,,,\n3913,Fast Single Image Reflection Suppression via Convex Optimization,Yang Yang (Tencent AI Lab)*; Wenye Ma (Tencent); Yin Zheng (Tencent AI Lab); Jian-Feng Cai (Hong Kong University of Science and Technology); Weiyu Xu (University of Iowa),Low-level Vision,Computational Photography,Poster,2.2,160,L,,,\n3929,A Mutual Learning Method for Salient Object Detection with intertwined Multi-Supervision,Runmin Wu (Dalian University of Technology ); Mengyang Feng (Dalian University of Technology); Wenlong Guan (Dalian University of Technology); Dong Wang (Dalian University of Technology); Huchuan Lu (Dalian University of Technology)*; Errui Ding (Baidu Inc.),Low-level Vision,\"Segmentation, Grouping and Shape\",Poster,2.2,161,L,,,\n3948,Enhanced Pix2pix Dehazing Network,Yanyun Qu (XMU)*; Yizi Chen (XMU); Jingying Huang (XMU); Yuan Xie (East China Normal University),Low-level Vision,,Poster,2.2,162,L,,,\n4014,Assessing Personally Perceived Image Quality via Image Features and Collaborative Filtering,Jari Korhonen (Shenzhen University)*,Low-level Vision,Statistical Learning; Vision Applications and Systems,Poster,2.2,163,L,,,\n4122,Single Image Reflection Removal Exploiting Misaligned Training Data and Network Enhancements,Kaixuan Wei (Beijing Institute of Technology); Jiaolong Yang (Microsoft Research Asia)*; Ying Fu (Beijing Institute of Technology); David Wipf (Microsoft Research Asia); Hua Huang (Beijing Institute of Technology),Low-level Vision,Computational Photography; Deep 
Learning,Poster,2.2,164,L,,,\n3375,Exploring Context and Visual Pattern of Relationship for Scene Graph Generation,\"Wenbin Wang (Institute of Computing Technology, Chinese Academy of Science); Ruiping Wang (ICT, CAS)*; Shiguang Shan (Chinese Academy of Sciences); Xilin Chen (China)\",Scene Analysis and Understanding,Deep Learning ; Visual Reasoning,Poster,2.2,165,M,,,\n3580,Learning from Synthetic Data for Crowd Counting in the Wild,\"Qi Wang (Northwestern Polytechnical University)*; Junyu Gao (Northwestern Polytechnical University, Center for OPTical IMagery Analysis and Learning); Wei Lin (Northwestern Polytechnical University, Center for OPTical IMagery Analysis and Learning); Yuan Yuan (Northwestern Polytechnical University)\",Scene Analysis and Understanding,Datasets and Evaluation; Image and Video Synthesis; Others; Representation Learning; Vision Applicat,Poster,2.2,166,M,,,\n3646,A Local Block Coordinate Descent Algorithm for the CSC Model,Ev Zisselman (Technion)*; Jeremias Sulam (jhu.edu); Michael Elad (Technion),Representation Learning,\"Image and Video Synthesis; Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Poster,2.2,167,M,,,\n3689,Not Using the Car to See the Sidewalk: Quantifying and Controlling the Effects of Context in Classification and Segmentation,Rakshith Shetty (Max Planck Institute of Informatics)*; Bernt Schiele (MPI Informatics); Mario Fritz (CISPA Helmholtz Center for Information Security),Scene Analysis and Understanding,\"Datasets and Evaluation; Recognition: Detection, Categorization, Retrieval\",Poster,2.2,168,M,,,\n3708,Discovering Interpretable Fair Representations,Novi Quadrianto (University of Sussex / HSE)*; Viktoriia Sharmanska (Imperial College London); Oliver Thomas (University of Sussex),Representation Learning,Image and Video Synthesis,Poster,2.2,169,M,,,\n3709,Actor-Critic Instance Segmentation,Nikita Araslanov (TU Darmstadt)*; Constantin Rothkopf (TU Darmstadt); Stefan Roth (TU Darmstadt),Scene 
Analysis and Understanding,\"Segmentation, Grouping and Shape\",Poster,2.2,170,M,,,\n3746,Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders,Edgar Schoenfeld (University of Amsterdam)*; Sayna Ebrahimi (UC Berkeley); Samrath Sinha (University of Toronto); Trevor Darrell (UC Berkeley); Zeynep Akata (University of Amsterdam),Representation Learning,Deep Learning,Poster,2.2,171,M,,,\n3747,SPNet: Semantic Projection Network for Zero-Label and Few-Label Semantic Segmentation,Yongqin Xian (Max Planck Institute Informatics)*; Subhabrata Choudhury (Max-Planck Institute for Informatics); Yang He (MPI Informatics); Bernt Schiele (MPI Informatics); Zeynep Akata (University of Amsterdam),Scene Analysis and Understanding,\"Segmentation, Grouping and Shape; Vision + Language\",Poster,2.2,172,M,,,\n3927,GCAN: Graph Convolutional Adversarial Network for Unsupervised Domain Adaptation,\"Xinhong Ma (National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences;University of Chinese Academy of Sciences)*; Tianzhu Zhang (CAS, China); Changsheng Xu (CASIA)\",Representation Learning,Vision Applications and Systems,Poster,2.2,173,M,,,\n4003,Seamless Scene Segmentation,Lorenzo Porzi (Mapillary Research); Samuel Rota Bulò (Mapillary Research); Aleksander Colovic (Mapillary Research); Peter Kontschieder (Mapillary Research)*,Scene Analysis and Understanding,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape\",Poster,2.2,174,M,,,\n4089,Unsupervised Image Matching and Object Discovery as Optimization,Huy V. 
Vo (Ecole Normale Supérieure - INRIA - Valeo.ai)*; Jean Ponce (Inria); Patrick Pérez (Valeo.ai); Francis Bach (INRIA - Ecole Normale Supérieure); Yann LeCun (New York University); Minsu Cho (POSTECH); Kai Han (University of Oxford),Scene Analysis and Understanding,Optimization Methods,Poster,2.2,175,M,,,\n4178,Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs,\"Qi ZHANG (City University of Hong Kong, Hong Kong); Antoni Chan (City University of Hong Kong, Hong Kong)*\",Scene Analysis and Understanding,\"Motion and Tracking; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and S\",Poster,2.2,176,M,,,\n12,Grounded Video Description,Luowei Zhou (University of Michigan)*; Yannis Kalantidis (Facebook Research); Xinlei Chen (Facebook AI Research); Jason J Corso (University of Michigan); Marcus Rohrbach (Facebook AI Research),Vision + Language,,Oral,2.2,177,N,,,\n3566,Streamlined Dense Video Captioning,Jonghwan Mun (POSTECH)*; Linjie Yang (ByteDance AI Lab); Zhou Ren (Snap Inc.); Ning Xu (Snap); Bohyung Han (Seoul National University),Vision + Language,Deep Learning ; Video Analytics,Oral,2.2,178,N,,,\n5612,Adversarial Inference for Multi-Sentence Video Description,Jae Sung Park (UC Berkeley); Marcus Rohrbach (Facebook AI Research); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley)*,Vision + Language,,Oral,2.2,179,N,,,\n4705,Unified Visual-Semantic Embeddings: Bridging Vision and Language with Structured Meaning Representations,\"Hao Wu (Fudan University)*; Jiayuan Mao (Tsinghua University); Yufeng Zhang (Fudan University); Weiwei Sun (\"\" Fudan University, China\"\"); Yuning Jiang (Bytedance); Lei Li (ByteDance AI Lab); Weiying Ma (Bytedance)\",Vision + Language,Representation Learning,Oral,2.2,180,N,,,\n3640,Learning to Compose Dynamic Tree Structures for Visual Contexts,Kaihua Tang (Nanyang Technological University)*; Hanwang Zhang (Nanyang Technological University); Baoyuan Wu (Tencent AI Lab); Wenhan 
Luo (Tencent AI Lab); Wei Liu (Tencent),Vision + Language,Scene Analysis and Understanding; Visual Reasoning,Oral,2.2,181,N,,,\n5104,Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation,\"Xin Wang (University of California, Santa Barbara)*; Qiuyuan Huang (Microsoft Research AI); Asli Celikyilmaz (Microsoft Research AI); Jianfeng Gao (Microsoft Research); Dinghan Shen (Duke University); Yuan-Fang Wang (UC Santa Barbara); William Yang Wang (UC Santa Barbara); Lei Zhang (Microsoft Research)\",Vision + Language,Robotics + Driving; Vision Applications and Systems; Visual Reasoning,Oral,2.2,182,N,,,\n1824,Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering,\"gao peng (Chinese university of hong kong)*; Hongsheng Li (Chinese University of Hong Kong); Haoxuan You (Tsinghua University); Zhengkai Jiang (Institute of Automation，Chinese Academy of Sciences); Pan Lu (Tsinghua University); Steven Hoi (SMU); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)\",Vision + Language,Vision Applications and Systems; Visual Reasoning,Oral,2.2,183,N,,,\n3454,Cycle-Consistency for Robust Visual Question Answering,Meet Shah (Facebook AI Research)*; Xinlei Chen (Facebook AI Research); Marcus Rohrbach (Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research),Vision + Language,,Oral,2.2,184,N,,,\n135,Embodied Question Answering in Photorealistic Environments with Point Cloud Perception,Erik Wijmans (Georgia Tech)*; Samyak Datta (Georgia Tech); Oleksandr Maksymets (Facebook AI Research); Abhishek Das (Georgia Tech); Georgia Gkioxari (Facebook); Stefan Lee (Georgia Institute of Technology); Irfan Essa (Georgia Institute of Technology); Dhruv Batra (Georgia Tech & Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research),Vision + Language,,Oral,2.2,185,N,,,\n3909,Reasoning Visual Dialogs with Structural and Partial Observations,Zilong Zheng (UCLA); Wenguan Wang 
(Inception Institute of Artificial Intelligence)*; Siyuan Qi (UCLA); Song-Chun Zhu (UCLA),Vision + Language,,Oral,2.2,186,N,,,\n3129,Recursive Visual Attention in Visual Dialog,Yulei Niu (Renmin University of China); Manli Zhang (Renmin University of China); Jianhong Zhang (Renmin University of China); Zhiwu Lu (Renmin University of China)*; Ji-Rong Wen (Renmin University of China); Hanwang Zhang (Nanyang Technological University),Vision + Language,Visual Reasoning,Oral,2.2,187,N,,,\n3820,Two Body Problem: Collaborative Visual Task Completion,\"Unnat Jain (UIUC)*; Luca Weihs (Allen Institute for Artificial Intelligence); Eric Kolve (Allen AI); Mohammad Rastegari (Allen Institute for Artificial Intelligence); Svetlana Lazebnik (UIUC); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); Alexander Schwing (UIUC); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence)\",Visual Reasoning,\"Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding\",Oral,2.2,188,N,,,\n7021,GQA: a new dataset for compositional question answering over real-world images,Drew A Hudson (Stanford University)*; Chris Manning (Stanford),Visual Reasoning,Datasets and Evaluation; Deep Learning ; Scene Analysis and Understanding; Vision + Language,Oral,2.2,189,N,,,\n1530,Text2Scene: Generating Compositional Scenes from Textual Descriptions,Fuwen Tan (University of Virginia)*; Song Feng (IBM Research); Vicente Ordonez (University of Virginia),Vision + Language,Image and Video Synthesis,Oral,2.2,190,N,,,\n5126,From Recognition to Cognition: Visual Commonsense Reasoning,\"Rowan Zellers (University of Washington)*; Yonatan Bisk (University of Washington); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); Yejin Choi (University of Washington)\",Vision + Language,\"Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding; Visual Reasonin\",Oral,2.2,191,N,,,\n3587,The 
Regretful Agent: Heuristic-Aided Navigation through Progress Estimation,Chih-Yao Ma (Georgia Institute of Technology)*; Zuxuan Wu (UMD); Ghassan AlRegib (Georgia Institute of Technology); Caiming Xiong (Salesforce Research); Zsolt Kira (Georgia Institute of Technology),Vision + Language,Visual Reasoning,Oral,2.2,192,N,,,\n6287,Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation,Liyiming Ke (University of Washington); Xiujun Li (Microsoft Research)*; Yonatan Bisk (University of Washington); Ari Holtzman (University of Washington); Zhe Gan (Microsoft); Jingjing Liu (Microsoft); Jianfeng Gao (Microsoft Research); Yejin Choi (University of Washington); Siddhartha Srinivasa (University of Washington),Vision + Language,Robotics + Driving,Oral,2.2,193,N,,,\n1770,Learning to Learn How to Learn: Self-Adaptive Visual Navigation using Meta-Learning,\"Mitchell N Wortsman (Allen Institute for Artificial Intelligence); Kiana Ehsani (University of Washington); Mohammad Rastegari (Allen Institute for Artificial Intelligence); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence); Roozbeh Mottaghi (Allen Institute for AI)*\",Visual Reasoning,Scene Analysis and Understanding; Statistical Learning,Oral,2.2,194,N,,,\n3422,Self-critical n-step Training for Image Captioning,\"Junlong Gao (Peking University Shenzhen Graduate School)*; Shiqi Wang (CityU); Shanshe Wang (Peking University); Siwei Ma (Peking University, China); Wen Gao (PKU)\",Vision + Language,Scene Analysis and Understanding,Poster,2.2,195,N,,,\n3450,Towards VQA models that can read,Amanpreet Singh (Facebook)*; Vivek Natarajan (.); Meet Shah (Facebook AI Research); Yu Jiang (Facebook AI Research); Xinlei Chen (Facebook AI Research); Dhruv Batra (Georgia Tech & Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research); Marcus Rohrbach (Facebook AI Research),Vision + Language,,Poster,2.2,196,N,,,\n3595,Object-aware Aggregation with Bidirectional 
Temporal Graph for Video Captioning,Junchao Zhang (Peking University); Yuxin Peng (Peking University)*,Vision + Language,Video Analytics,Poster,2.2,197,N,,,\n3652,Progressive Attention Memory Network for Movie Story Question Answering,\"Junyeong Kim (KAIST)*; Minuk Ma (KAIST); Kyungsu Kim (samsung research); Sungjin Kim (SAMSUNG ELECTRONICS CO.,LTD); Chang D. Yoo (KAIST)\",Vision + Language,Vision Applications and Systems; Visual Reasoning,Poster,2.2,198,N,,,\n3659,Memory-Attended Recurrent Network for Video Captioning,Wenjie Pei (Tencent)*; Jiyuan Zhang (Tencent YouTu); Xiangrong Wang (Delft University of Technology); Lei Ke (Tencent); Xiaoyong Shen (Tencent); Yu-Wing Tai (Tencent),Vision + Language,Deep Learning ; Video Analytics,Poster,2.2,199,N,,,\n3846,Visual Query Answering by Entity-attribute Graph Matching and Reasoning,Peixi Xiong (Northwestern University)*; HUAYI ZHAN (NORTHWESTERN UNIVERSITY); Xin Wang (Southwest Jiaotong University); Baivab Sinha (Sichuan Changhong Electric Co. 
Ltd); Ying Wu (Northwestern University),Vision + Language,Datasets and Evaluation; Vision + Graphics,Poster,2.2,200,N,,,\n3848,Look Back and Predict Forward in Image Captioning,Yu Qin (Shanghai Jiao Tong University)*; Jiajun Du (Shanghai Jiao Tong University); Hongtao Lu (Shanghai Jiao Tong University); Yonghua Zhang (Bytedance),Vision + Language,Deep Learning ; Image and Video Synthesis,Poster,2.2,201,N,,,\n3908,Explainable and Explicit Visual Reasoning over Scene Graphs,Jiaxin Shi (Tsinghua University)*; Hanwang Zhang (Nanyang Technological University); Juanzi Li (Tsinghua University),Visual Reasoning,Vision + Language,Poster,2.2,202,N,,,\n4024,Transfer Learning via Unsupervised Task Discovery for Visual Question Answering,Hyeonwoo Noh (POSTECH)*; Taehoon Kim (DEVSISTERS); Jonghwan Mun (POSTECH); Bohyung Han (Seoul National University),Vision + Language,Deep Learning ; Representation Learning,Poster,2.2,203,N,,,\n4115,Intention Oriented Image Captions with Guiding Objects,Yue Zheng (Tsinghua University); Ya-Li Li (THU); Shengjin Wang (Tsinghua University)*,Vision + Language,Deep Learning,Poster,2.2,204,N,,,\n3404,Uncertainty Guided Multi-Scale Residual Learning-using a Cycle Spinning CNN for Single Image De-Raining,\"Rajeev Yasarla ( Johns Hopkins University, Whiting School of Engineering)*; Vishal Patel (Johns Hopkins University)\",Vision Applications and Systems,Image and Video Synthesis; Low-level Vision,Poster,2.2,205,O,,,\n3467,Toward Realistic Image Composition with Adversarial Training,Bor-Chun Chen (University of Maryland)*; Andrew Kae (Oath),Vision Applications and Systems,\"Computational Photography; Deep Learning ; Image and Video Synthesis; Segmentation, Grouping and Sha\",Poster,2.2,206,O,,,\n3468,Cross-Classification Clustering: An Efficient Multi-Object Tracking Technique for 3-D Instance Segmentation in Connectomics,Yaron Meirovitch (MIT CSAIL)*; Lu Mi (MIT); Hayk Saribekyan (MIT); Alexander Matveev (MIT); David Rolnick (MIT); Nir Shavit 
(Massachusetts Institute of Technology),\"Medical, Biological and Cell Microscopy\",\"Deep Learning ; Motion and Tracking; Recognition: Detection, Categorization, Retrieval; Segmentation\",Poster,2.2,207,O,,,\n3473,Deep ChArUco: Dark ChArUco Marker Pose Estimation,Tomasz Malisiewicz (MIT)*; Daniel DeTone (Magic Leap); Danying Hu (Magic Leap),Vision Applications and Systems,Motion and Tracking; Robotics + Driving,Poster,2.2,208,O,,,\n3482,Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving,Yan Wang (Cornell University)*; Wei-Lun Chao (Cornell University); Divyansh Garg (Cornell University); Bharath Hariharan (Cornell University); Mark Campbell (Cornell University); Kilian Weinberger (Cornell University),Robotics + Driving,\"3D from Multiview and Sensors; Recognition: Detection, Categorization, Retrieval\",Poster,2.2,209,O,,,\n3533,Rules of the Road: Predicting Driving Behavior with a Convolutional Model of Semantic Interactions,Benjamin Sapp (Google)*; Joey Hong (Caltech); James Philbin (Zoox),Robotics + Driving,Deep Learning ; Motion and Tracking; Vision Applications and Systems,Poster,2.2,210,O,,,\n3614,Metric Learning for Image Registration,\"Marc Niethammer (UNC)*; Roland Kwitt (\"\"University of Salzburg, Austria\"\"); Francois-Xavier Vialard (University Paris-Est)\",\"Medical, Biological and Cell Microscopy\",Deep Learning ; Motion and Tracking,Poster,2.2,211,O,,,\n3654,LO-Net: Deep Real-time Lidar Odometry,Qing Li (Xiamen University); Shaoyang Chen (Xiamen University); Cheng Wang (Xiamen University)*; Xin Li (Louisiana State University); Chenglu Wen (Xiamen University); Ming Cheng (Xiamen University); Jonathan Li (Xiamen University),Robotics + Driving,Vision Applications and Systems,Poster,2.2,212,O,,,\n3841,TraPHic: Trajectory Prediction in Dense and Heterogeneous Traffic Using Weighted Interactions,\"Rohan Chandra (University of Maryland)*; Uttaran Bhattacharya (University of Maryland, College Park); 
Aniket Bera (The University of North Carolina at Chapel Hill); Dinesh Manocha (UMD)\",Robotics + Driving,Vision Applications and Systems,Poster,2.2,213,O,,,\n3912,World from Blur,Jiayan Qiu (University of Sydney)*; Xinchao Wang (Stevens Institute of Technology); Stephen J Maybank (Birkbeck College); Dacheng Tao (University of Sydney),Vision Applications and Systems,Deep Learning,Poster,2.2,214,O,,,\n3941,Topology Reconstruction of Tree-like Structure in Images via Structural Similarity Measure and Dominant Set Clustering,\"Jianyang Xie (Cixi Institute of Biomedical Engineering, Ningbo Institute of Industrial Technology, Chinese Academy of Sciences); Yitian Zhao (Cixi Institute of Biomedical Engineering, Ningbo Institute of Industrial Technology, Chinese Academy of Sciences)*; Yonghuai Liu (Edge Hill University); Su Pan (Cixi Institute of Biomedical Engineering, Ningbo Institute of Industrial Technology, Chinese Academy of Sciences); Yifan Zhao (School of Aerospace, Transport and Manufacturing, Cranfield University); Jun Cheng (Cixi Institute of Biomedical Engineering, Chinese Academy of Sciences); Yalin Zheng (University of Liverpool); Jiang Liu (\"\"Chinese Academy of Sciences, China\"\")\",\"Medical, Biological and Cell Microscopy\",\"Image and Video Synthesis; Low-level Vision; Recognition: Detection, Categorization, Retrieval; Segm\",Poster,2.2,215,O,,,\n4043,Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training,\"Feng Zheng (Southern University of Science and Technology)*; Rongrong Ji (Xiamen University, China); Cheng Deng (Xidian University); Xing Sun (Tencent); Xinyang Jiang (Tencent); Xiaowei Guo (Tencent Youtu Lab); Zongqiao Yu (Tencent); Feiyue Huang (Tencent)\",Vision Applications and Systems,\"Deep Learning ; Motion and Tracking; Recognition: Detection, Categorization, Retrieval; Representati\",Poster,2.2,216,O,,,\n4200,A Flexible Convolutional Solver for Fast Style Transfers,Gilles Puy (Technicolor)*; Patrick Pérez (Valeo.ai),Deep 
Learning,Image and Video Synthesis,Poster,3.1,1,A,,,\n4258,Cross Domain Model Compression by Structured Weight Sharing,Shangqian Gao (University of Pittsburgh)*; Cheng Deng (Xidian University); Heng Huang (University of Pittsburgh),Deep Learning,Action Recognition ; Others; RGBD sensors and analytics,Poster,3.1,2,A,,,\n4331,TraVeLGAN: Image-to-image Translation by Transformation Vector Learning,Matthew Amodio (Yale University)*; Smita Krishnaswamy (),Deep Learning,Image and Video Synthesis; Others,Poster,3.1,3,A,,,\n4381,Deep Robust Subjective Visual Property Prediction in Crowdsourcing,\"Qianqian Xu (Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences); Zhiyong Yang (SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences; SCS, University of Chinese Academy of Sciences); Yangbangyan Jiang (SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences; SCS, University of Chinese Academy of Sciences); Xiaochun Cao (Chinese Academy of Sciences); Qingming Huang (University of Chinese Academy of Sciences)*; Yuan Yao (HongKong University of Science and Technology)\",Deep Learning,Others,Poster,3.1,4,A,,,\n4421,Transferable AutoML by Model Sharing over Grouped Datasets,Xue Chao (IBM Research)*; Junchi Yan (Shanghai Jiao Tong University); Rong Yan (IBM ); Stephen M Chu (IBM Research - China); Yonghua Lin (IBM); Yonggang Hu (IBM),Deep Learning,Datasets and Evaluation,Poster,3.1,5,A,,,\n4452,Learning Not to Learn: Training Deep Neural Networks with Biased Data,\"Byungju Kim (KAIST)*; Hyunwoo Kim (Beijing Institute of Technology); Kyungsu Kim (samsung research); Sungjin Kim (SAMSUNG ELECTRONICS CO.,LTD); Junmo Kim (KAIST)\",Deep Learning,\"Face, Gesture, and Body Pose ; Optimization Methods; Recognition: Detection, Categorization, Retriev\",Poster,3.1,6,A,,,\n4472,IRLAS: Inverse Reinforcement Learning for Architecture Search,Minghao Guo (Sensetime)*; Zhao Zhong (CASIA); Wei Wu 
(SenseTime Group Limited); Dahua Lin (The Chinese University of Hong Kong); Junjie Yan (Sensetime Group Limited),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.1,7,A,,,\n4526,Learning for Single-Shot Confidence Calibration in Deep Neural Networks through Stochastic Inferences,Seonguk Seo (Seoul National University)*; Paul Hongsuck Seo (POSTECH); Bohyung Han (Seoul National University),Deep Learning,,Poster,3.1,8,A,,,\n4562,Attention-based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions,Masanori Suganuma (RIKEN AIP / Tohoku University)*; Xing Liu (Tohoku university); Takayuki Okatani (Tohoku University/RIKEN AIP),Deep Learning,Computational Photography; Image and Video Synthesis; Low-level Vision; Vision Applications and Syst,Poster,3.1,9,A,,,\n4666,Fully Learnable Group Convolution for Acceleration of Deep Neural Networks,\"Xijun Wang (Institute of Computing Technology, Chinese Academy of Sciences); Meina Kan (Institute of Computing Technology, Chinese Academy of Sciences)*; Shiguang Shan (Chinese Academy of Sciences); Xilin Chen (China)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.1,10,A,,,\n4692,EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching from Scratch,Jian Ren (Rutgers University)*; Zhe Li (The University of Iowa ); Jianchao Yang (Snapchat); Ning Xu (Snap Inc); Tianbao Yang (University of Iowa); David Foran (Rutgers Cancer Institute of New Jersey),Deep Learning,,Poster,3.1,11,A,,,\n4727,Deep Incremental Hashing Network for Efficient Image Retrieval,\"Dayan Wu (Institute of Information Engineering, Chinese Academy of Sciences)*; Qi Dai (Microsoft Research); Jing Liu (Institute of Information Engineering, Chinese Academy of Sciences); Bo Li ( Institute of Information Engineering, Chinese Academy of Sciences); Weiping Wang (Institute of Information Engineering, CAS, China)\",Deep Learning,\"Recognition: 
Detection, Categorization, Retrieval\",Poster,3.1,12,A,,,\n4753,\"Robustness via curvature regularization, and vice versa\",Seyed-Mohsen Moosavi-Dezfooli (EPFL)*; Alhussein Fawzi (Google Deepmind); Jonathan Uesato (DeepMind); Pascal Frossard (EPFL),Deep Learning,,Poster,3.1,13,A,,,\n4754,SparseFool: a few pixels make a big difference,Apostolos Modas (EPFL)*; Seyed-Mohsen Moosavi-Dezfooli (EPFL); Pascal Frossard (EPFL),Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,3.1,14,A,,,\n4904,Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks,Jörg Wagner (Bosch Center for Artificial Intelligence)*; Jan Mathias Koehler (Bosch Center for Artificial Intelligence); Tobias Gindele (Robert Bosch GmbH); Leon Hetzel (Bosch Center for Artificial Intelligence); Jakob Thaddaeus Wiedemer (Bosch Center for Artificial Intelligence); Sven Prof. Behnke (University of Bonn),Deep Learning,\"Others; Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems; Visual R\",Poster,3.1,15,A,,,\n4951,Structured Pruning of Neural Networks with Budget-Aware Regularization,\"Carl Lemaire (Universite de Sherbrooke)*; Andrew Achkar (Miovision Technologies Inc., Canada); Pierre-Marc Jodoin (Universite de Sherbrooke)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,3.1,16,A,,,\n5033,MBS: Macroblock Scaling for CNN Model Reduction,Yu-Hsun Lin (CTBC Bank)*; Chun-Nan Chou (HTC Research); Edward Chang (),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.1,17,A,,,\n5072,Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells,\"Vladimir Nekrasov (University of Adelaide / ACRV)*; Hao Chen (The University of Adelaide); Chunhua Shen (University of Adelaide); Ian Reid (\"\"University of Adelaide, Australia\"\")\",Deep Learning,\"Scene Analysis and Understanding; Segmentation, Grouping 
and Shape\",Poster,3.1,18,A,,,\n5078,Generating 3D Adversarial Point Clouds,Chong Xiang (Shanghai Jiao Tong University)*; Charles R. Qi (Facebook AI Research); Bo Li (University of Illinois at Urbana–Champaign),Deep Learning,3D from Multiview and Sensors; Optimization Methods; Representation Learning; Statistical Learning,Poster,3.1,19,A,,,\n5163,Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search,Xin Li (UISEE Company)*; Yiming Zhou (University of Electrical Science and Technology of China); Zheng Pan (UISEE Company); Jiashi Feng (NUS),Deep Learning,\"Segmentation, Grouping and Shape\",Poster,3.1,20,A,,,\n5165,Memory In Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity from Spatiotemporal Dynamics,\"Jianjin Zhang (Tsinghua University); Yunbo Wang (Tsinghua University); Hongyu Zhu (Tsinghua University); Mingsheng Long (Tsinghua University)*; Jianmin Wang (\"\"Tsinghua University, China\"\"); Philip S Yu (UIC)\",Deep Learning,Video Analytics,Poster,3.1,21,A,,,\n5177,Variational Information Distillation for Knowledge Transfer,Sungsoo Ahn (KAIST)*; Shell Hu (ENPS); Andreas Damianou (Amazon); Neil Lawrence (Amazon); Zhenwen Dai (Amazon),Deep Learning,Statistical Learning,Poster,3.1,22,A,,,\n5193,GaterNet: Dynamic Filter Selection in Convolutional Neural Network via a Dedicated Global Gating Network,Zhourong Chen (The Hong Kong University of Science and Technology)*; Yang Li (Google Research); Samy Bengio (Google Brain); Si Si (Google Research),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.1,23,A,,,\n5194,SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360◦ Images,Kuk-Jin Yoon (KAIST)*; Yeonkun Lee (KAIST); Jaeseok Jeong (KAIST); Jongseob Yun (KAIST); Wonjune Cho (KAIST),Deep Learning,Deep Learning,Poster,3.1,24,A,,,\n5206,\"ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network\",Sachin Mehta (University of 
Washington)*; Mohammad Rastegari (Allen Institute for Artificial Intelligence); Linda Shapiro (University of Washington); Hannaneh Hajishirzi (University of Washington),Deep Learning,\"Others; Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,3.1,25,A,,,\n5208,Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors,Mohammad Mahdi Derakhshani (University of Tehran); saeed masoudnia (university of tehran); Amirhossein Shaker (university of Tehran); Omid Mersa (University of Tehran); Mohammad Amin Sadeghi (University of Tehran); Mohammad Rastegari (Allen Institute for Artificial Intelligence)*; Babak Nadjar Araabi (Tehran University),Deep Learning,\"Computer Vision Theory; Datasets and Evaluation; Recognition: Detection, Categorization, Retrieval;\",Poster,3.1,26,A,,,\n5245,Exploiting Edge Features in Graph Neural Networks,Liyu Gong (University of Kentucky)*; Qiang Cheng (University of Kentucky),Deep Learning,Others,Poster,3.1,27,A,,,\n5286,Propagation Mechanism for Deep and Wide Neural Networks,Dejiang Xu (National University of Singapore); Mong Li Lee (National University of Singapore)*; Wynne Hsu (National University of Singapore),Deep Learning,,Poster,3.1,28,A,,,\n5348,\"Catastrophic Child's Play: Easy to Perform, Hard to Defend Adversarial Attacks\",\"Chih-Hui Ho (University of California San Diego)*; Erik Sandström (Lund University); Brandon Leung (University of California, San Diego); Yen Chang (University of California, San Diego); Nuno Vasconcelos (UC San Diego)\",Deep Learning,Others,Poster,3.1,29,A,,,\n5353,Embedding Complementary Deep Networks for Image Classification,Qiuyu Chen (University of North Carolina at Charlotte)*; Wei Zhang (Fudan University); Jun Yu (HDU); Jianping Fan (UNCC),Deep Learning,Computer Vision Theory,Poster,3.1,30,A,,,\n3380,Deep Multimodal Clustering for Unsupervised Audiovisual Learning,\"Di Hu (NWPU)*; Feiping Nie (Northwestern Polytechnical University); Xuelong Li 
(Northwestern Polytechnical University, China)\",\"Recognition: Detection, Categorization, Retrieval\",\"Big Data, Large Scale Methods ; Others; Representation Learning; Scene Analysis and Understanding; V\",Poster,3.1,31,B,,,\n4215,Dense Classification and Implanting for Few-shot Learning,Yann R Lifchitz (Safran)*; Yannis Avrithis (Inria); Sylvaine Picard (Safran); Andrei Bursuc (Valeo),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.1,32,B,,,\n4223,Class-Balanced Loss Based on Effective Number of Samples,Yin Cui (Cornell University)*; Tsung-Yi Lin (Google Brain); Menglin Jia (Cornell University); Yang Song (Google); Serge Belongie (Cornell University),\"Recognition: Detection, Categorization, Retrieval\",Computer Vision Theory; Deep Learning,Poster,3.1,33,B,,,\n4295,Discovering Visual Patterns in Art Collections with Spatially-consistent Feature Learning,Xi Shen (École des Ponts ParisTech)*; Alexei A Efros (UC Berkeley); Mathieu Aubry (École des ponts ParisTech),\"Recognition: Detection, Categorization, Retrieval\",\"Deep Learning ; Representation Learning; Segmentation, Grouping and Shape; Vision Applications and S\",Poster,3.1,34,B,,,\n4345,Min-Max Statistical Alignment for Transfer Learning,\"Samitha Herath (The Australian National University)*; Basura Fernando (Agency for Science, Technology and Research, A*STAR, Singapore); Mehrtash Harandi (Monash University); Richard Nock (Data61-CSIRO)\",\"Recognition: Detection, Categorization, Retrieval\",Representation Learning,Poster,3.1,35,B,,,\n4388,Spatial-aware Graph Relation Network for Large-scale Object Detection,Hang Xu (Huawei Noah's Ark Lab); ChenHan Jiang (Sun Yat-sen University); Xiaodan Liang (Sun Yat-sen University)*; Zhenguo Li (Huawei Noah's Ark Lab),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Visual Reasoning,Poster,3.1,36,B,,,\n4390,\"Deformable ConvNets v2: More Deformable, Better Results\",Xizhou Zhu (University of Science and Technology of 
China); Han Hu (Microsoft Research Asia); Stephen Lin (Microsoft Research); Jifeng Dai (Microsoft Research Asia)*,\"Recognition: Detection, Categorization, Retrieval\",,Poster,3.1,37,B,,,\n4396,Interaction-and-Aggregation Network for Person Re-identification,Ruibing Hou (Institute of Computing Technology，Chinese Academy); Bingpeng MA (UCAS)*; Hong Chang (Chinese Academy of Sciences); Xinqian Gu (University of Chinese Academy of Sciences); Shiguang Shan (Chinese Academy of Sciences); Xilin Chen (China),\"Recognition: Detection, Categorization, Retrieval\",,Poster,3.1,38,B,,,\n4406,Rare Event Detection using Disentangled Representation Learning,Ryuhei Hamaguchi (National Institute of Advanced Industrial Science and Technology)*; Ken Sakurada (National Institute of Advanced Industrial Science and Technology); Ryosuke Nakamura (National Institute of Advanced Industrial Science and Technology),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning; Scene Analysis and Understanding; Vision Applications and S,Poster,3.1,39,B,,,\n4610,Shape Robust Text Detection with Progressive Scale Expansion Network,Wenhai Wang (Nanjing university); Xiang Li (NJUST); Enze Xie (Tongji University); Wenbo Hou (Nanjing University); Tong Lu (Nanjing University)*; Gang Yu (Face++); Shuai Shao (Megvii (Face++)),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.1,40,B,,,\n4657,Dual Dense Encoding for Zero-Example Video Retrieval,Jianfeng Dong (Zhejiang Gongshang University); Xirong Li (Renmin University of China)*; Chaoxi Xu (Renmin University of China); Shouling Ji (Zhejiang University); Yuan He (Alibaba Group ); Gang Yang (Renmin University of China); Xun Wang (Zhejiang Gongshang University),\"Recognition: Detection, Categorization, Retrieval\",Vision + Language,Poster,3.1,41,B,,,\n4678,MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors,\"Lile Cai (Institute for Infocomm Research)*; Bin Zhao (Nil); Zhe 
Wang (I2R); Jie Lin (Institute for Infocomm Research (I2R), Singapore); Chuan Sheng Foo (Institute for Infocomm Research, A*STAR); Mohamed Sabry Aly (Nanyang Technological University); Vijay Chandrasekhar (Institute for Infocomm Research)\",\"Recognition: Detection, Categorization, Retrieval\",Robotics + Driving; Vision Applications and Systems,Poster,3.1,42,B,,,\n4706,Character Region Awareness for Text Detection,\"Youngmin Baek ( Clova AI Research, NAVER Corp.)*; Bado Lee (Clova AI Research, NAVER Corp.); Dongyoon Han (Clova AI Research, NAVER Corp.); Sangdoo Yun ( Clova AI Research, NAVER Corp.); Hwalsuk Lee ( Clova AI Research, NAVER Corp.)\",\"Recognition: Detection, Categorization, Retrieval\",Document Analysis,Poster,3.1,43,B,,,\n4722,Effective Aesthetics Prediction with Multi-level Spatially Pooled Features,Vlad Hosu (University of Konstanz)*; Bastian Goldluecke (University of Konstanz); Dietmar Saupe (University of Konstanz),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Others,Poster,3.1,44,B,,,\n4750,Attentive Region Embedding Network for Zero-shot Learning,Guo-Sen Xie (Inception Institute of Artificial Intelligence); Li Liu (the inception institute of artificial intelligence); Xiao-Bo Jin (Xi'an Jiaotong-Liverpool University); Fan Zhu (Inception Institute of Artificial Intelligence); Zheng Zhang (The University of Queensland)*; Jie Qin (Inception Institute of Artificial Intelligence); Yazhou Yao (Inception Institute of Artificial Intelligence); Ling Shao (Inception Institute of Artificial Intelligence),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,3.1,45,B,,,\n4755,Explicit Spatial Encoding for Deep Local Descriptors,\"Arun Mukundan (CTU in Prague)*; Giorgos Tolias (Vision Recognition Group, Czech Technical University in Prague); Ondrej Chum (Vision Recognition Group, Czech Technical University in Prague)\",\"Recognition: Detection, Categorization, Retrieval\",Representation 
Learning,Poster,3.1,46,B,,,\n4761,Panoptic Segmentation,Alexander Kirillov (Facebook AI Research)*; Kaiming He (Facebook AI Research); Ross Girshick (FAIR); Carsten Rother (University of Heidelberg); Piotr Dollar (FAIR),\"Recognition: Detection, Categorization, Retrieval\",\"Segmentation, Grouping and Shape\",Poster,3.1,47,B,,,\n4780,You reap what you sow: Generating High Precision Object Proposals for Weakly-supervised Object Detection,\"Krishna Kumar Singh (University of California Davis)*; Yong Jae Lee (University of California, Davis)\",\"Recognition: Detection, Categorization, Retrieval\",Video Analytics,Poster,3.1,48,B,,,\n4836,Explore-Exploit Graph Traversal for Image Retrieval,Guangwei Yu (Layer6.ai)*; Chundi Liu (Layer6.ai); Cheng Chang (Layer6.ai); Maksims Volkovs (Layer6 AI),\"Recognition: Detection, Categorization, Retrieval\",,Poster,3.1,49,B,,,\n4837,Dissimilarity Coefficient based Weakly Supervised Object Detection,Aditya Arun (IIIT Hyderabad)*; C.V. Jawahar (IIIT-Hyderabad); M. 
Pawan Kumar (University of Oxford),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.1,50,B,,,\n4887,Kernel Transformer Networks for Compact Spherical Convolution,Yu-Chuan Su (UT Austin)*; Kristen Grauman (Facebook AI Research & UT Austin),\"Recognition: Detection, Categorization, Retrieval\",,Poster,3.1,51,B,,,\n4977,Object detection with location-aware deformable convolution and backward attention filtering,Chen Zhang (Illinois Institute of Technology)*; Joohee Kim (Illinois Institute of Technology),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.1,52,B,,,\n5304,Variational Prototyping-Encoder: One-Shot Learning with Prototypical Images,Junsik Kim (Korea Advanced Institute of Science and Technology (KAIST))*; Tae-Hyun Oh (MIT CSAIL); Seokju Lee (Korea Advanced Institute of Science and Technology (KAIST)); Fei Pan (Korea Advanced Institute of Science and Technology); In So Kweon (KAIST),\"Recognition: Detection, Categorization, Retrieval\",Vision Applications and Systems,Poster,3.1,53,B,,,\n5319,Unsupervised Domain Adaptation using Feature-Whitening and Consensus Loss,Subhankar Roy (University of Trento)*; Aliaksandr Siarohin (University of Trento); Enver Sangineto (University of Trento); Samuel Rota Bulo' (Mapillary Research); Nicu Sebe (University of Trento); Elisa Ricci (FBK - Technologies of Vision),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.1,54,B,,,\n3471,UPSNet: A Unified Panoptic Segmentation Network,Yuwen Xiong (Uber ATG; University of Toronto)*; Renjie Liao (University of Toronto); Hengshuang Zhao (The Chinese University of Hong Kong); Rui Hu (Uber); Min Bai (University of Toronto); Ersin Yumer (Uber ATG); Raquel Urtasun (Uber ATG),Scene Analysis and Understanding,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape;\",Oral,3.1,55,C,,,\n5828,Joint Semantic-Instance Segmentation of 3D Point Clouds Using Multi-Set Label 
Conditional Random Fields,\"Quang-Hieu Pham (Singapore University of Technology and Design)*; Binh-Son Hua (The University of Tokyo); Thanh Nguyen (Deakin University, Australia); Gemma Roig (MIT); Sai-Kit Yeung (Hong Kong University of Science and Technology)\",Scene Analysis and Understanding,\"Deep Learning ; Segmentation, Grouping and Shape\",Oral,3.1,56,C,,,\n1303,Proposal-free instance segmentation with a clustering loss function,Davy Neven (KULeuven)*; Bert De Brabandere (KU Leuven); Marc Proesmans (KU Leuven); Luc Van Gool (ETH Zurich),\"Segmentation, Grouping and Shape\",\"Recognition: Detection, Categorization, Retrieval\",Oral,3.1,57,C,,,\n153,Deep Instance Co-segmentation by Co-peak Search and Co-saliency Detection,Kuang-Jui Hsu (Academia Sinica)*; Yen-Yu Lin (Academia Sinica); Yung-Yu Chuang (National Taiwan University),\"Segmentation, Grouping and Shape\",Datasets and Evaluation; Deep Learning ; Low-level Vision,Oral,3.1,58,C,,,\n4250,Improving Semantic Segmentation via Video Propagation and Label Relaxation,Yi Zhu (UC Merced); Karan Sapra (NVIDIA)*; Fitsum Reda (NVIDIA); Kevin Shih (NVIDIA); Shawn Newsam (UC Merced); Andrew Tao (NVIDIA); Bryan Catanzaro (NVIDIA),\"Segmentation, Grouping and Shape\",Deep Learning ; Image and Video Synthesis,Oral,3.1,59,C,,,\n3121,Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video,Samvit Jain (UC Berkeley)*; Xin Wang (UC Berkeley); Joey Gonzalez (Berkeley),\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Oral,3.1,60,C,,,\n4311,Shape2Motion: Joint Analysis of Motion Parts and Attributes from 3D Shapes,\"Xiaogang Wang (State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering,Beihang University); Kai Xu (National University of Defense Technology)*; Yahao Shi (State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering,Beihang University); Bin Zhou (State Key Laboratory of 
Virtual Reality Technology and Systems, School of Computer Science and Engineering,Beihang University); Xiaowu Chen (); Qinping Zhao (State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering,Beihang University)\",\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding; Vision + Graphics,Oral,3.1,61,C,,,\n256,Semantic Correlation Promoted Shape-Variant Context for Segmentation,Henghui Ding (Nanyang Technological University)*; Xudong Jiang (Nanyang Technological University); Bing Shuai (Amazon); Ai Qun Liu (Nanyang Technological University); Gang Wang (Alibaba Group),\"Segmentation, Grouping and Shape\",Deep Learning ; Scene Analysis and Understanding,Oral,3.1,62,C,,,\n2493,Relation-Shape Convolutional Neural Network for Point Cloud Analysis,\"Yongcheng Liu (Institute of Automation, Chinese Academy of Sciences)*; Bin Fan (Institute of Automation, Chinese Academy of Sciences, China); SHIMING XIANG (Chinese Academy of Sciences, China); Chunhong Pan (Institute of Automation, Chinese Academy of Sciences)\",\"Segmentation, Grouping and Shape\",Deep Learning ; Representation Learning,Oral,3.1,63,C,,,\n1233,Enhancing Diversity of Defocus Blur Detectors via Cross-Ensemble Network,Wenda Zhao (Dalian University of Technology)*; Bowen Zheng (Dalian University of Technology); Qiuhua Lin (Dalian University of Technology); Huchuan Lu (Dalian University of Technology),\"Segmentation, Grouping and Shape\",Deep Learning,Oral,3.1,64,C,,,\n2793,BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames,Brent Griffin (University of Michigan)*; Jason J Corso (University of Michigan),\"Segmentation, Grouping and Shape\",Datasets and Evaluation; Deep Learning ; Motion and Tracking,Oral,3.1,65,C,,,\n3074,Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-high Resolution Images,Wuyang Chen (Texas A&M University)*; Ziyu Jiang (Texas A&M University); Zhangyang 
Wang (TAMU); Kexin Cui (Texas A&M University); Xiaoning Qian (Texas A&M University),\"Segmentation, Grouping and Shape\",Deep Learning,Oral,3.1,66,C,,,\n3203,Efficient Parameter-free Clustering Using First Neighbor Relations,Saquib Sarfraz (Karlsruhe Institute of Technology)*; Vivek Sharma (Karlsruhe Institute of Technology); Rainer Stiefelhagen (Karlsruhe Institute of Technology),\"Segmentation, Grouping and Shape\",\"Big Data, Large Scale Methods ; Others; Representation Learning; Vision Applications and Systems\",Oral,3.1,67,C,,,\n3859,Learning Personalized Modular Network Guided by Structured Knowledge,Xiaodan Liang (Sun Yat-sen University)*,\"Segmentation, Grouping and Shape\",Deep Learning,Oral,3.1,68,C,,,\n5616,A Generative Appearance Model for End-to-end Video Object Segmentation,Joakim Johnander (Linköping University)*; Martin Danelljan (ETH Zurich); Emil Brissman (Linköping University); Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence); Michael Felsberg (Linköping University),\"Segmentation, Grouping and Shape\",Motion and Tracking,Oral,3.1,69,C,,,\n3335,FEELVOS: Fast End-to-End Embedding Learning for Video Object Segmentation,Paul Voigtlaender (RWTH Aachen University)*; Yuning Chai (Alphabet); Bastian Leibe (RWTH Aachen University-); Florian Schroff (Google Inc.); Hartwig Adam (Google); Liang-Chieh Chen (Google Inc.),\"Segmentation, Grouping and Shape\",Deep Learning ; Motion and Tracking,Poster,3.1,70,C,,,\n4313,PartNet: A Recursive Part Decomposition Network for Hierarchical Segmentation of 3D Shapes,Fenggen Yu (Nanjing University); Kun Liu (Nanjing University); Yan Zhang (Nanjing University); Chenyang Zhu (Simon Fraser University); Kai Xu (National University of Defense Technology)*,\"Segmentation, Grouping and Shape\",Vision + Graphics,Poster,3.1,71,C,,,\n4874,Learning Multi-Class Segmentations From Single-Class Datasets,Konstantin Dmitriev (Stony Brook University)*; Arie Kaufman (Stony Brook University),\"Segmentation, Grouping and 
Shape\",\"Medical, Biological and Cell Microscopy; Scene Analysis and Understanding\",Poster,3.1,72,C,,,\n4944,Convolutional Recurrent Network for Road Boundary Extraction,\"Justin JL Liang (Uber ATG)*; Namdar Homayounfar (University of Toronto); Wei-Chiu Ma (MIT); Shenlong Wang (Uber ATG, University of Toronto); Raquel Urtasun (Uber ATG)\",\"Segmentation, Grouping and Shape\",Deep Learning ; Robotics + Driving; Vision Applications and Systems,Poster,3.1,73,C,,,\n5215,DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation,Hanchao Li (Beijing Institute of Technology)*; Pengfei Xiong (Megvii(face++) Research); Haoqiang Fan (Megvii Inc（face++）); Jian Sun (Megvii Technology),\"Segmentation, Grouping and Shape\",Scene Analysis and Understanding,Poster,3.1,74,C,,,\n5332,A Cross-Season Correspondence Dataset for Robust Semantic Segmentation,Måns Larsson (Chalmers)*; Erik Stenborg (Chalmers University); Lars Hammarstrand (Chalmers University of Technology); Marc Pollefeys (ETH Zurich / Microsoft); Torsten Sattler (Chalmers University of Technology); Fredrik Kahl (Chalmers),\"Segmentation, Grouping and Shape\",Datasets and Evaluation; Deep Learning,Poster,3.1,75,C,,,\n5356,ManTra-Net: Manipulation Tracing Network For Detection And Localization of Image Forgeries With Anomalous Features,Yue Wu (USC ISI)*; Wael Abd-Almageed (Information Sciences Institute); Prem Natarajan (USC ISI),\"Segmentation, Grouping and Shape\",Computational Photography; Deep Learning ; Image and Video Synthesis; Statistical Learning; Vision A,Poster,3.1,76,C,,,\n2609,\"Divergence Triangle for Joint Training of Generator Model, Energy-based Model, and Inferential Model\",\"Tian Han (University of California, Los Angeles)*; Erik Nijkamp (UCLA); Xiaolin Fang (Zhejiang University); Mitchell K Hill (UCLA Department of Statistics); Song-Chun Zhu (UCLA); Ying Nian Wu (University of California, Los Angeles)\",Statistical Learning,Deep Learning ; Image and Video 
Synthesis,Oral,3.1,77,D,,,\n2829,IMAGE DEFORMATION META-NETWORK FOR ONE-SHOT LEARNING,Zitian Chen (Fudan University); Yanwei Fu (Fudan University)*; Yu-Xiong Wang (Carnegie Mellon University); Lin Ma (Tencent AI Lab); Wei Liu (Tencent); Martial Hebert (Carnegie Mellon University),Statistical Learning,Representation Learning,Oral,3.1,78,D,,,\n4917,Online high-rank matrix completion,Jicong Fan (Cornell University)*; Madeleine Udell (Cornell University),Statistical Learning,\"Big Data, Large Scale Methods ; Motion and Tracking\",Oral,3.1,79,D,,,\n1274,Multispectral Imaging for Fine-Grained Recognition of Powders on Complex Backgrounds,Tiancheng Zhi (Carnegie Mellon University)*; Bernardo Pires (-); Martial Hebert (Carnegie Mellon School of Computer Science); Srinivasa G Narasimhan (Carnegie Mellon University),Physics-based Vision and Shape-from-X,Computational Photography,Oral,3.1,80,D,,,\n2138,ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging,\"Samarth Brahmbhatt (Georgia Institute of Technology)*; Cusuh Ham (Georgia Institute of Technology); Charlie Kemp (Georgia Institute of Technology); James Hays (Georgia Institute of Technology, USA)\",Datasets and Evaluation,\"Others; Robotics + Driving; Segmentation, Grouping and Shape\",Oral,3.1,81,D,,,\n4535,Robust Subspace Clustering with Independent and Piecewise Identically Distributed Noise Modeling,Yuanman Li (University of Macau); Jiantao Zhou (University of Macau)*; Xianwei Zheng (Foshan University); Jinyu Tian (University of Macau); Yuan Yan Tang (University of Macau),Computer Vision Theory,\"Representation Learning; Segmentation, Grouping and Shape; Vision Applications and Systems\",Oral,3.1,82,D,,,\n4185,What Correspondences Reveal about Unknown Camera and Motion Models?,Thomas Probst (ETH Zurich)*; Ajad Chhatkuli (ETH Zurich); Danda Pani Paudel (ETH Zürich); Luc Van Gool (ETH Zurich),Computer Vision Theory,3D from Multiview and Sensors,Oral,3.1,83,D,,,\n1504,Self-calibrating Deep Photometric 
Stereo Networks,Guanying Chen (The University of Hong Kong)*; Kai Han (University of Oxford); Boxin Shi (Peking University); Yasuyuki Matsushita (Osaka University); Kwan-Yee K. Wong (The University of Hong Kong),Physics-based Vision and Shape-from-X,Computational Photography,Oral,3.1,84,D,,,\n4994,Know Before You Go: 3D Tracking and Forecasting with Rich Maps,\"John W Lambert (Georgia Institute of Technology)*; James Hays (Georgia Institute of Technology, USA); Jagjeet Singh (CMU); MingFang Chang (Carnegie Mellon University); Simon Lucey (CMU); Deva Ramanan (Carnegie Mellon University); Patsorn Sangkloy (Georgia Institute of Technology); De Wang (Argo AI); Pete Carr (Argo AI)\",Datasets and Evaluation,3D from Multiview and Sensors; Robotics + Driving,Oral,3.1,85,D,,,\n5176,Side Window Filtering,Hui Yin (Shenzhen University); Yuanhao Gong (Shenzhen University); Guoping Qiu (Shenzhen University)*,Computer Vision Theory,Low-level Vision; Others,Oral,3.1,86,D,,,\n2319,Defense Against Adversarial Images using Web-Scale Nearest-Neighbor Search,Abhimanyu Dubey (Massachusetts Institute of Technology)*; Laurens van der Maaten (Facebook); Zeki Yalniz (Facebook); Yixuan Li (Facebook Research); Dhruv Mahajan (Facebook),\"Big Data, Large Scale Methods\",Datasets and Evaluation; Deep Learning ; Representation Learning; Visual Reasoning,Oral,3.1,87,D,,,\n1519,Incremental Object Learning from Contiguous Views,Stefan Stojanov (Georgia Institute of Technology)*; Samarth Mishra (Georgia Institute of Technology); Ngoc Anh Thai (Georgia Institute of Technology); Nikhil Dhanda (Georgia Institute of Technology); Ahmad Humayun (Georgia Institute of Technology); Linda Smith (Indiana University); Chen Yu (Indiana University); James Rehg (Georgia Institute of Technology),Datasets and Evaluation,\"Deep Learning ; Image and Video Synthesis; Recognition: Detection, Categorization, Retrieval\",Oral,3.1,88,D,,,\n3627,IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition,Xiaoping Wu 
(Nankai University); Chi Zhan (Nankai University); Yukun Lai (Cardiff University); Ming-Ming Cheng (Nankai University); Jufeng Yang (Nankai University)*,Datasets and Evaluation,Vision Applications and Systems,Oral,3.1,89,D,,,\n6334,CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification,ZHENG TANG (University of Washington)*; Milind Naphade (NVidia); Ming-Yu Liu (NVIDIA); Xiaodong Yang (NVIDIA Research); Stan Birchfield (NVIDIA); Shuo Wang (NVidia); Ratnesh Kumar (NVIDIA); David Anastasiu (SJSU); Jenq-Neng Hwang (University of WA),Datasets and Evaluation,\"Big Data, Large Scale Methods ; Motion and Tracking\",Oral,3.1,90,D,,,\n6439,Social-IQ: A Question Answering Benchmark for Open-ended Social Intelligence,Amir Zadeh (CMU LTI)*; Michael K Chan (CMU); Paul Pu Liang (Carnegie Mellon University); Edmund Tong (CMU); Louis-Philippe Morency (Carnegie Mellon University),Datasets and Evaluation,Vision + Language ; Visual Reasoning,Oral,3.1,91,D,,,\n4302,On zero-shot recognition of generic objects,Tristan E.M Hascoet (Kobe University)*,Datasets and Evaluation,Vision + Language,Poster,3.1,92,D,,,\n4371,Explicit Bias Discovery in Visual Question Answering Models,\"Varun Manjunatha (University of Maryland, College Park)*; Nirat Saini (University of Maryland); Larry Davis (University of Maryland)\",Datasets and Evaluation,\"Big Data, Large Scale Methods ; Deep Learning ; Vision + Language ; Visual Reasoning\",Poster,3.1,93,D,,,\n4423,REPAIR: Removing Representation Bias by Dataset Resampling,Yi Li (UC San Diego)*; Nuno Vasconcelos (UC San Diego),Datasets and Evaluation,Action Recognition ; Deep Learning ; Representation Learning; Video Analytics,Poster,3.1,94,D,,,\n4744,Label Efficient Semi-Supervised Learning via Graph Filtering,Qimai LI (The Hong Kong PolyU)*; Xiao-Ming Wu (PolyU Hong Kong); Han Liu (The Hong Kong Polytechnic University); Xiaotong Zhang (The Hong Kong Polytechnic University); Zhichao GUAN (The Hong Kong 
Polytechnic University),Statistical Learning,,Poster,3.1,95,D,,,\n4769,A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection,Paul Bergmann (MVTec Software GmbH)*; Michael Fauser (MVTec Software GmbH); David Sattlegger (MVTec Software GmbH); Carsten Steger (MVTec Software GmbH),Datasets and Evaluation,Deep Learning,Poster,3.1,96,D,,,\n4878,ABC: A Big CAD Model Dataset For Geometric Deep Learning,Sebastian Koch (TUB)*; Albert Matveev (Skoltech); Zhongshi Jiang (New York University); Francis Williams (New York University); Alexey Artemov (Skoltech); Evgeny Burnaev (Skoltech); Marc Alexa (TUB); Denis Zorin (New York University); Daniele Panozzo (NYU),Datasets and Evaluation,\"Big Data, Large Scale Methods ; Deep Learning ; Recognition: Detection, Categorization, Retrieval; S\",Poster,3.1,97,D,,,\n4946,Tightness-aware Evaluation Protocol for Scene Text Detection,Yuliang Liu (South China University of Technology); Lianwen Jin (South China University of Technology)*; Zecheng Xie (South China University of Technology); Canjie Luo (South China University of Technology); Lele Xie (South China University of Technology); Shuaitao Zhang (South China University of Technology),Datasets and Evaluation,\"Document Analysis ; Others; Recognition: Detection, Categorization, Retrieval; Scene Analysis and Un\",Poster,3.1,98,D,,,\n4220,PointConv: Deep Convolutional Networks on 3D Point Clouds,Wenxuan Wu (Oregon State University)*; Zhongang Qi (Oregon State University); Li Fuxin (Oregon State University),3D from Multiview and Sensors,Deep Learning ; RGBD sensors and analytics; Robotics + Driving; Scene Analysis and Understanding; Se,Poster,3.1,99,E,,,\n4334,Octree guided CNN with Spherical Kernels for 3D Point Clouds,Huan Lei (The University of Western Australia)*; Naveed Akhtar (The University of Western Australia); Ajmal Mian (University of Western Australia),3D from Multiview and Sensors,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Segmentation, 
Grouping and Shape\",Poster,3.1,100,E,,,\n4463,VITAMIN-E: VIsual Tracking And MappINg with Extremely Dense Feature Points,Masashi Yokozuka (National Institute of Advanced Industrial Science and Technology (AIST))*; Shuji Oishi (National Institute of Advanced Industrial Science and Technology (AIST)); Simon Thompson (AIST); Atsuhiko Banno (AIST),3D from Multiview and Sensors,Motion and Tracking,Poster,3.1,101,E,,,\n4478,Conditional Single-view Shape Generation for Multi-view Stereo Reconstruction,Yi Wei (Tsinghua University); Shaohui Liu (Tsinghua University); Wang Zhao (Tsinghua University); Jiwen Lu (Tsinghua University)*,3D from Multiview and Sensors,,Poster,3.1,102,E,,,\n4522,Learning to Adapt for Stereo,Alessio Tonioni (University of Bologna)*; Tom Joy (University of Oxford); Oscar Rahnama (University of Oxford); Luigi Di Stefano (University of Bologna); Thalaiyasingam Ajanthan (University of Oxford); Philip Torr (University of Oxford),3D from Multiview and Sensors,Deep Learning,Poster,3.1,103,E,,,\n4851,3D Appearance Super-Resolution with Deep Learning,Yawei Li (ETH Zurich)*; Vagia Tsiminaki (ETH Zurich); Radu Timofte (ETH Zurich); Marc Pollefeys (ETH Zurich / Microsoft); Luc Van Gool (ETH Zurich),3D from Multiview and Sensors,Datasets and Evaluation,Poster,3.1,104,E,,,\n4955,Radial Distortion Triangulation,Zuzana Kukelova (Czech Technical University in Prague)*; Viktor Larsson (ETH Zurich),3D from Multiview and Sensors,,Poster,3.1,105,E,,,\n5331,Robust Point Cloud Reconstruction of Large-Scale Outdoor Scenes,Ziquan Lan (NUS)*; Zi Jian Yew (National University of Singapore); Gim Hee Lee (National University of Singapore),3D from Multiview and Sensors,Robotics + Driving,Poster,3.1,106,E,,,\n4180,Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment,\"Pedro Miraldo (KTH Royal Institute of Technology, Stockholm)*; Surojit Saha (University of Utah); Srikumar Ramalingam (University of Utah)\",RGBD sensors and analytics,3D from Multiview and 
Sensors,Poster,3.1,107,F,,,\n4204,Volumetric Capture of Humans with a Single RGBD Camera via Semi-Parametric Learning,Rohit Pandey (Google)*; Anastasia Tkach (Google); Shuoran Yang (Google); Pavel Pidlypenskyi (Google); Jonathan Taylor (Google Inc.); Ricardo Martin-Brualla (Google); Andrea Tagliasacchi (Google Inc.); George Papandreou (Google); Philip Davidson (Google Inc.); Cem Keskin (Google); Shahram Izadi (Google); Sean Fanello (Google),3D from Single Image,Deep Learning ; Image and Video Synthesis,Poster,3.1,108,F,,,\n4442,Joint Face Detection and Facial Motion Retargeting for Multiple Faces,BINDITA CHAUDHURI (University of Washington)*; Noranart Vesdapunt (Microsoft Research); Baoyuan Wang (Microsoft Research),3D from Single Image,\"Deep Learning ; Face, Gesture, and Body Pose ; Motion and Tracking\",Poster,3.1,109,F,,,\n4450,Monocular Depth Estimation Using Relative Depth Maps,Jaehan Lee (Korea University)*; Chang-Su Kim (Korea University),3D from Single Image,Low-level Vision,Poster,3.1,110,F,,,\n4635,Unsupervised Primitive Discovery for Improved 3D Generative Modeling,Salman Khan (Australian National University (ANU))*; Yulan Guo (National University of Defense Technology); Munawar Hayat (University of Canberra); Nick Barnes (CSIRO(Data61)),3D from Single Image,Deep Learning ; Image and Video Synthesis,Poster,3.1,111,F,,,\n4679,Learning to Explore Intrinsic Saliency for Stereoscopic Video,Qiudan ZHANG (City University of Hong Kong); Xu Wang (Shenzhen University)*; Shiqi Wang (CityU); Shikai LI (Shenzhen University); Sam Kwong (City University of Hong Kong); Jianmin Jiang (Shenzhen University),RGBD sensors and analytics,3D from Multiview and Sensors; Datasets and Evaluation; Deep Learning ; Video Analytics,Poster,3.1,112,F,,,\n4686,\"Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on n-Spheres\",Shuai Liao (University of Amsterdam)*; Stratis Gavves (University of Amsterdam); Cees Snoek (University of Amsterdam),3D from Single 
Image,,Poster,3.1,113,F,,,\n4818,Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation,Andrea Pilzer (Università di Trento)*; Stéphane Lathuiliere (University of Trento); Nicu Sebe (University of Trento); Elisa Ricci (FBK - Technologies of Vision),3D from Single Image,Deep Learning,Poster,3.1,114,F,,,\n4999,Learning View Priors for Single-view 3D Reconstruction,Hiroharu Kato (The University of Tokyo)*; Tatsuya Harada (The University of Tokyo),3D from Single Image,Deep Learning,Poster,3.1,115,F,,,\n5090,Geometry-Aware Symmetric Domain Adaptation for Monocular Depth Estimation,Shanshan Zhao (The University of Sydney)*; Huan Fu (The University of Sydney); Mingming Gong (University of Pittsburgh); Dacheng Tao (University of Sydney),3D from Single Image,Image and Video Synthesis,Poster,3.1,116,F,,,\n5256,Learning monocular depth estimation infusing traditional stereo knowledge,Fabio Tosi (University of Bologna); Filippo Aleotti (University of Bologna); Matteo Poggi (University of Bologna)*; Stefano Mattoccia (University of Bologna),3D from Single Image,3D from Multiview and Sensors; Deep Learning ; Low-level Vision,Poster,3.1,117,F,,,\n5263,SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception,\"Dinesh Bharadia (University of California San Diego); Tara Javidi (University of California San Diego); Gaurav Bansal (Airbus Labs); Rui Guo (Toyota InfoTechnology Center USA); Aman Raj (University of California San Diego); Samuel Sunarjo (University of California, San Diego); Yongxi Lu (University of California, San Diego); Yue Meng (University of California San Diego)*\",3D from Single Image,Deep Learning ; Motion and Tracking; Robotics + Driving; Scene Analysis and Understanding; Segmentat,Poster,3.1,118,F,,,\n4183,3D Guided Fine-Grained Face Manipulation,Zhenglin Geng (Stanford University); Chen Cao (Snap Inc.)*; Sergey Tulyakov (Snap Inc),\"Face, Gesture, and Body Pose\",Image and Video 
Synthesis,Poster,3.1,119,G,,,\n4206,Neuro-inspired Eye Tracking with Eye Movement Dynamics,Kang Wang (RPI)*; Hui Su (IBM); Qiang Ji (Rensselaer Polytechnic Institute),\"Face, Gesture, and Body Pose\",,Poster,3.1,120,G,,,\n4436,Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally,Xiuyi Jia (Nanjing University of Science and Technology)*; Xiang Zheng (Nanjing University of Science and Technology); Weiwei Li (Nanjing University of Aeronautics and Astronautics); Changqing Zhang (Tianjin university); Zechao Li (Nanjing University of Science and Technology),\"Face, Gesture, and Body Pose\",Computer Vision Theory,Poster,3.1,121,G,,,\n4531,Unsupervised Face Normalization with Extreme Pose and Expression in the Wild,Yichen Qian (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications)*; Jiani Hu (Beijing University of Posts and Telecommunications),\"Face, Gesture, and Body Pose\",Image and Video Synthesis,Poster,3.1,122,G,,,\n4586,Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision,Soubhik Sanyal (Max Planck Institute for Intelligent Systems)*; Timo Bolkart (Max Planck Institute for Intelligent Systems); Haiwen Feng (Max Planck Institute for Intelligent Systems); Michael J. 
Black (Max Planck Institute for Intelligent Systems),\"Face, Gesture, and Body Pose\",3D from Single Image; Deep Learning,Poster,3.1,123,G,,,\n4625,R3 Adversarial Network for Cross Model Face Recognition,ken chen (sensetime); Yichao Wu (Sensetime Group Limited)*; Haoyu Qin (Sensetime); Ding Liang (Sensetime Group Limited); Xuebo Liu (SenseTime Group Ltd); Junjie Yan (Sensetime Group Limited),\"Face, Gesture, and Body Pose\",Representation Learning,Poster,3.1,124,G,,,\n4791,Disentangling Latent Hands for Image Synthesis and Pose Estimation,Linlin Yang (University of Bonn); Angela Yao (National University of Singapore)*,\"Face, Gesture, and Body Pose\",Image and Video Synthesis; Representation Learning,Poster,3.1,125,G,,,\n5180,Generating Multiple Hypotheses for 3D Human Pose Estimation with Mixture Density Network,Chen Li (National University of Singapore)*; Gim Hee Lee (National University of Singapore),\"Face, Gesture, and Body Pose\",Deep Learning,Poster,3.1,126,G,,,\n5207,CrossInfoNet: Multi-Task Information Sharing Based Hand Pose Estimation,Kuo Du (Dalian University of Technology); Xiangbo Lin (Dalian University of Technology)*; Yi Sun (Dalian University of Technology); Xiaohong Ma (Dalian University of Technology),\"Face, Gesture, and Body Pose\",\"3D from Single Image; Action Recognition ; Recognition: Detection, Categorization, Retrieval\",Poster,3.1,127,G,,,\n5285,P2SGrad: Refined Gradients for Optimizing Deep Face Models,\"Xiao Zhang (Chinese University of Hong Kong); Rui Zhao (SenseTime Group Limited); Junjie Yan (Sensetime Group Limited); Mengya Gao (Tianjin University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*; Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Hongsheng Li (Chinese University of Hong Kong)\",\"Face, Gesture, and Body Pose\",\"Biometrics ; Deep Learning ; Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Poster,3.1,128,G,,,\n4560,Action Recognition from Single 
Timestamp Supervision in Untrimmed Videos,Davide Moltisanti (University of Bristol)*; Sanja Fidler (University of Toronto); Dima Damen (University of Bristol),Action Recognition,\"Recognition: Detection, Categorization, Retrieval; Video Analytics\",Poster,3.1,129,H,,,\n4749,Time-Conditioned Action Anticipation in One Shot,Qiuhong Ke (MPI)*; Mario Fritz (CISPA Helmholtz Center for Information Security); Bernt Schiele (MPI Informatics),Action Recognition,Deep Learning,Poster,3.1,130,H,,,\n4882,Dance with Flow: Two-in-One Stream Action Detection,JIAOJIAO ZHAO (University of Amsterdam)*; Cees Snoek (University of Amsterdam),Action Recognition,,Poster,3.1,131,H,,,\n4973,Representation Flow for Action Recognition,AJ Piergiovanni (Indiana University)*; Michael S Ryoo (Google Brain; Indiana University),Action Recognition,Deep Learning ; Representation Learning,Poster,3.1,132,H,,,\n5009,LSTA: Long Short-Term Attention for Egocentric Action Recognition,\"Swathikiran Sudhakaran (Fondazione Bruno Kessler, Italy)*; Sergio Escalera (Computer Vision Center (UAB) & University of Barcelona,); Oswald Lanz (Fondazione Bruno Kessler, Italy)\",Action Recognition,Deep Learning,Poster,3.1,133,H,,,\n5028,Learning Actor Relation Graphs for Group Activity Recognition,Jianchao Wu (Nanjing University ); Limin Wang (Nanjing University)*; Li Wang (NanJing University); Jie Guo (Nanjing University); Gangshan Wu (Nanjing University),Action Recognition,Video Analytics,Poster,3.1,134,H,,,\n5156,A Structured Model For Action Detection,Yubo Zhang (Carnegie Mellon University)*; Pavel Tokmakov (CMU); Cordelia Schmid (INRIA); Martial Hebert (Carnegie Mellon School of Computer Science),Action Recognition,Video Analytics,Poster,3.1,135,H,,,\n5185,Out-of-Distribution Detection for Generalized Zero-Shot Action Recognition,\"Devraj Mandal (Indian Institute of Science, Bangalore); Sanath Narayan (Inception Institute of Artificial Intelligence)*; Sai Kumar Dwivedi (Mercedes Benz Research and Development India); 
vikram gupta (MBRDI); Shuaib Ahmed (Mercedes-Benz R&D India); Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence); Ling Shao (Inception Institute of Artificial Intelligence)\",Action Recognition,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.1,136,H,,,\n4179,Object Discovery in Videos as Foreground Motion Clustering,Christopher Xie (University of Washington)*; Yu Xiang (Nvidia); Dieter Fox (NVIDIA); Zaid Harchaoui (University of Washington),Motion and Tracking,\"Deep Learning ; Segmentation, Grouping and Shape; Video Analytics\",Poster,3.1,137,I,,,\n4338,Towards Natural and Accurate Future Motion Prediction of Humans and Animals,Zhenguang Liu (Zhejiang Gongshang University)*; Shuang Wu (Nanyang Technological University); Shuyuan Jin (NUS); Qi Liu (National University of Singapore); Shijian Lu (Nanyang Technological University); Roger Zimmermann (NUS); Li Cheng (University of Alberta),Motion and Tracking,Deep Learning,Poster,3.1,138,I,,,\n4366,Automatic Face Aging in Videos via Deep Reinforcement Learning,Chi Nhan Duong ( Concordia University)*; Khoa Luu (University of Arkansas); Kha Gia Quach (Concordia University); Nghia H Nguyen (University of Arkansas); Eric Patterson (Clemson University); Tien D Bui (Concordia); Ngan Le (Carnegie Mellon University),Biometrics,\"Deep Learning ; Face, Gesture, and Body Pose ; Image and Video Synthesis\",Poster,3.1,139,I,,,\n5369,Multi-adversarial Discriminative Deep Domain Generalization for Face Presentation Attack Detection,\"Rui Shao (Department of Computer Science, Hong Kong Baptist University)*; Xiangyuan Lan (Department of Computer Science, Hong Kong Baptist University); Jiawei Li (Hong Kong Baptist University); PongChi Yuen (Department of Computer Science, Hong Kong Baptist University)\",Biometrics,Vision Applications and Systems,Poster,3.1,140,I,,,\n4284,Using a Transformation Content Block For Image Style Transfer,Dmytro Kotovenko (Heidelberg University)*; Artsiom O Sanakoyeu (Heidelberg 
University); Bjorn Ommer (Heidelberg University); Sabine Lang (Heidelberg University); Pingchuan Ma (Heidelberg University),Image and Video Synthesis,Computational Photography; Deep Learning,Poster,3.1,141,J,,,\n4378,BeautyGlow: On-Demand Makeup Transfer Framework with Reversible Generative Network,\"Hung-Jen Chen (National Chiao Tung University); Ka Ming Hui (National Chiao Tung University); Szu Yu Wang (National Chiao Tung University); Li-Wu Tsao (National Chiao Tung University); Hong-Han Shuai (National Chiao Tung University)*; Wen-Huang Cheng (EE, NCTU)\",Image and Video Synthesis,Deep Learning,Poster,3.1,142,J,,,\n4740,Style Transfer by Relaxed Optimal Transport and Self-Similarity,Nicholas I Kolkin (Toyota Technological Institute at Chicago)*; Jason Salavon (University of Chicago); Greg Shakhnarovich (TTI-Chicago),Image and Video Synthesis,Vision + Graphics,Poster,3.1,143,J,,,\n4906,Inserting Videos into Videos,Donghoon Lee (Seoul National University)*; Tomas Pfister (Google); Ming-Hsuan Yang (University of California at Merced),Image and Video Synthesis,Deep Learning ; Video Analytics; Vision Applications and Systems,Poster,3.1,144,J,,,\n5027,Learning Image and Video Compression through Spatial-Temporal Energy Compaction,\"Zhengxue Cheng (Waseda University)*; Heming Sun (Waseda University, Japan); Masaru Takeuchi (Waseda University); Jiro Katto (Waseda University)\",Image and Video Synthesis,Deep Learning ; Representation Learning; Video Analytics; Vision Applications and Systems,Poster,3.1,145,J,,,\n5141,Event-based High Dynamic Range Image and Very High Frame Rate Video Generation using Conditional Generative Adversarial Networks,S. Mohammad Mostafavi I. 
(GIST); Lin Wang (KAIST); Yo-Sung HO (GIST); Kuk-Jin Yoon (KAIST)*,Image and Video Synthesis,Computational Photography; Deep Learning ; Low-level Vision,Poster,3.1,146,J,,,\n5293,Enhancing TripleGAN for Semi-Supervised Conditional Instance Synthesis and Classification,Si Wu (South China University of Technology)*; Guangchang Deng (South China University of Technology); Jichang Li (South China University of Technology); Rui Li (City University of Hong Kong); Zhiwen Yu (South China University of Technology); Hau San Wong (City University of Hong Kong),Image and Video Synthesis,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,3.1,147,J,,,\n4443,\"Capture, Learning, and Synthesis of 3D Speaking Styles\",Daniel Cudeiro (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Cassidy Laidlaw (Max Planck Institute for Intelligent Systems); Anurag Ranjan (MPI for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems)*,Vision + Graphics,\"Deep Learning ; Face, Gesture, and Body Pose\",Poster,3.1,148,K,,,\n4550,Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds using Convolutional Neural Networks,Yizhak Ben-Shabat (Technion)*; Anath Fischer (Technion); Michael Lindenbaum (Technion),Vision + Graphics,Deep Learning,Poster,3.1,149,K,,,\n4624,Ray-Space Projection Model for Light Field Camera,Qi Zhang (Northwestern Polytechnical University); Jinbo Ling (Northwestern Polytechnical University); Qing Wang (Northwestern Polytechnical University)*; Jingyi Yu (Shanghai Tech University),Computational Photography,Computer Vision Theory,Poster,3.1,150,K,,,\n4806,Deep Geometric Prior for Surface Reconstruction,\"Francis Williams (New York University)*; Teseo Schneider (NYU Courant Institute); Claudio Silva (NYU); Denis Zorin (New York University); Joan Bruna (Courant Institute of Mathematical Sciences, NYU, USA); Daniele Panozzo (NYU)\",Vision + Graphics,Deep Learning ; 
RGBD sensors and analytics,Poster,3.1,151,K,,,\n4814,Analysis of Feature Visibility in Non-Line-of-Sight Measurements,Xiaochun Liu (University of Wisconsin - Madison)*; Sebastian Bauer (UW Madison); Andreas Velten (University of Wisconsin - Madison),Computational Photography,,Poster,3.1,152,K,,,\n5081,Hyperspectral Imaging with Random Printed Mask,Yuanyuan Zhao (Nanjing University); Hui Guo (Nanjing University); Zhan Ma (Nanjing University); Xun Cao (Nanjing University); Tao Yue (Nanjing University); Xuemei Hu (Nanjing University)*,Computational Photography,,Poster,3.1,153,K,,,\n5128,All-Weather Deep Outdoor Lighting Estimation,Jinsong Zhang (Université Laval); Kalyan Sunkavalli (Adobe Research); Yannick Hold-Geoffroy (Adobe Research); Sunil Hadap (Adobe); Jonathan Eisenman (Adobe Systems); Jean-Francois Lalonde (Université Laval)*,Vision + Graphics,Computational Photography; Deep Learning,Poster,3.1,154,K,,,\n4186,A variational EM framework with adaptive edge selection for blind motion deblurring,Liuge Yang (National University of Singapore); Hui Ji (National University of Singapore)*,Low-level Vision,Computational Photography,Poster,3.1,155,L,,,\n4482,Viewport Proposal CNN for 360° Video Quality Assessment,Chen Li (BUAA); Mai Xu (BUAA)*; Lai Jiang (BUAA); Shanyi Zhang (BUAA); Xiaoming Tao (Tsinghua University),Low-level Vision,,Poster,3.1,156,L,,,\n4685,Beyond Gradient Descent for Regularized Segmentation Losses,Dmitrii Marin (University of Waterloo)*; Meng Tang (University of Waterloo); Ismail Ben Ayed (ETS Montreal); Yuri Boykov (University of Waterloo),Optimization Methods,\"Deep Learning ; Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,3.1,157,L,,,\n4694,MAGSAC: marginalizing sample consensus,\"Dániel Baráth (MTA SZTAKI, CMP Prague)*; Jiri Matas (CMP CTU FEE); Jana Noskova (CMP CTU FEE)\",Low-level Vision,\"3D from Multiview and Sensors; Segmentation, Grouping and Shape\",Poster,3.1,158,L,,,\n4725,Understanding and Visualizing 
Deep Visual Saliency Models,Sen He (University of Exeter)*; Hamed Rezazadegan Tavakoli (Aalto University); Ali Borji (University of Central Florida); Yang Mi (University of Exeter); Nicolas Pugeault (Exeter),Low-level Vision,Deep Learning,Poster,3.1,159,L,,,\n4783,Divergence Prior and Vessel-tree Reconstruction,Zhongwen Zhang (University of Western Ontario); Dmitrii Marin (University of Waterloo)*; Egor Chesakov (University of Western Ontario (former PhD student)); Yuri Boykov (University of Waterloo); Maria Drangova (Robarts Research Institute); Marc Moreno Maza (University of Western Ontario),Optimization Methods,\"Medical, Biological and Cell Microscopy\",Poster,3.1,160,L,,,\n4819,Unsupervised Domain-Specific Deblurring via Disentangled Representations,Boyu Lu (University of Maryland)*; Jun-Cheng Chen (University of Maryland); Rama Chellappa (University of Maryland),Low-level Vision,Deep Learning,Poster,3.1,161,L,,,\n4897,Douglas-Rachford Networks: Learning Both the Image Prior and Data Fidelity Terms for Blind Image Deconvolution,Raied RA Aljadaany (CMU)*; Dipan K Pal (Carnegie Mellon University); Marios Savvides (Carnegie Mellon University),Low-level Vision,,Poster,3.1,162,L,,,\n4920,Speed Invariant Time Surface for Learning to Detect Corner Points with Event-Based Cameras,Jacques Manderscheid (PROPHESEE); Amos Sironi (PROPHESEE)*; Nicolas Bourdis (PROPHESEE); Davide Migliore (PROPHESEE); Vincent Lepetit (L'Universite de Bordeaux),Low-level Vision,,Poster,3.1,163,L,,,\n4975,Training deep learning based image denoisers from undersampled measurements without ground truth and without image prior,Magauiya Zhussip (UNIST); Shakarim Soltanayev (UNIST); Se Young Chun (Ulsan National Institute of Science and Technology)*,Low-level Vision,\"Computational Photography; Medical, Biological and Cell Microscopy\",Poster,3.1,164,L,,,\n5294,A Variational Pan-Sharpening with Local Gradient Constraints,Xinghao Ding (Xiamen University); Zihuang Lin (xiamen university); Xueyang 
Fu (Xiamen University); Yue Huang (Xiamen University)*,Low-level Vision,,Poster,3.1,165,L,,,\n3745,f-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning,Yongqin Xian (Max Planck Institute Informatics)*; Saurabh Sharma (Max Planck Institute for Informatics); Bernt Schiele (MPI Informatics); Zeynep Akata (University of Amsterdam),Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,3.1,166,M,,,\n4238,Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation,Chen-Yu Lee (Apple)*; Tanmay Batra (Apple); Mohammad Haris Baig (Apple); Daniel Ulbricht (apple),Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding\",Poster,3.1,167,M,,,\n4649,Graph Attention Convolution for Point Cloud Segmentation,Lei Wang (Wuhan University)*; Yuchun Huang (Wuhan University); Yaolin Hou (Wuhan University); Shenman Zhang (Wuhan University); Jie Shan (Purdue),Scene Analysis and Understanding,\"3D from Multiview and Sensors; Deep Learning ; Segmentation, Grouping and Shape\",Poster,3.1,168,M,,,\n4658,Normalized Diversification,Shaohui Liu (Tsinghua University)*; Xiao Zhang (University of Pennsylvania); Jianqiao Q Wangni (University of Pennsylvania); Jianbo Shi (University of Pennsylvania),Representation Learning,,Poster,3.1,169,M,,,\n4890,Learning to Localize through Compressed Binary Maps,\"Xinkai Wei (Uber ATG, University of Waterloo)*; Shenlong Wang (Uber ATG, University of Toronto); Julieta Martinez (Uber ATG); Ioan Andrei Bârsan (Uber ATG, University of Toronto); Raquel Urtasun (Uber ATG)\",Representation Learning,Robotics + Driving,Poster,3.1,170,M,,,\n4914,A Parametric Top-View Representation of Complex Road Scenes,Ziyan Wang (CMU); Buyu Liu (NEC Labs); Samuel Schulter (NEC Labs)*; Manmohan Chandraker (UC San Diego),Scene Analysis and Understanding,3D from Single Image; Deep Learning ; Robotics + Driving; Visual 
Reasoning,Poster,3.1,171,M,,,\n5096,Self-supervised Spatiotemporal Learning via Video Clip Order Prediction,Dejing Xu (Zhejiang University)*; Jun Xiao (Zhejiang University); Zhou Zhao (Zhejiang University); Jian Shao (Zhejiang University); Di Xie (Hikvision Research Institute); Yueting Zhuang (Zhejiang University),Representation Learning,\"Action Recognition ; Recognition: Detection, Categorization, Retrieval; Video Analytics\",Poster,3.1,172,M,,,\n5106,Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids,Despoina Paschalidou (MPI-IS Tuebingen )*; Ali O Ulusoy (Microsoft); Andreas Geiger (MPI-IS and University of Tuebingen),Scene Analysis and Understanding,\"Physics-based Vision and Shape-from-X; Segmentation, Grouping and Shape; Visual Reasoning\",Poster,3.1,173,M,,,\n5227,Unsupervised Disentangling of Appearance and Geometry by Deformable Generator Network,\"Xianglei Xing (Harbin Engineering University)*; Tian Han (University of California, Los Angeles); Ruiqi Gao (UCLA); Song-Chun Zhu (UCLA); Ying Nian Wu (University of California, Los Angeles)\",Representation Learning,\"Deep Learning ; Face, Gesture, and Body Pose ; Image and Video Synthesis; Statistical Learning\",Poster,3.1,174,M,,,\n5242,Unsupervised Representation Learning by Rotation Feature Decoupling,Zeyu Feng (University of Sydney)*; Chang Xu (University of Sydney); Dacheng Tao (University of Sydney),Representation Learning,Deep Learning ; Scene Analysis and Understanding,Poster,3.1,175,M,,,\n5248,Weakly Supervised Deep Image Hashing through Tag Embeddings,Vijetha R Gattupalli (Arizona State University)*; Yaoxin Zhuo (Arizona State University); baoxin Li (Arizona State University),Representation Learning,Deep Learning,Poster,3.1,176,M,,,\n5278,Improved Road Connectivity by Joint Learning of Orientation and Segmentation,\"Anil Batra (IIIT)*; Suriya Singh (IIIT, Mila); Guan Pang (Facebook); Saikat Basu (Facebook); C.V. 
Jawahar (IIIT-Hyderabad); Manohar Paluri (Facebook)\",Scene Analysis and Understanding,\"Segmentation, Grouping and Shape\",Poster,3.1,177,M,,,\n5287,Deep Supervised Cross-modal Retrieval,\"Liangli Zhen (Institute of High Performance Computing, A*STAR); Peng Hu (Sichuan University); Xu Wang (Sichuan University); Dezhong Peng (Sichuan University)*\",Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.1,178,M,,,\n5310,A Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning,\"Thanh-Toan Do (The University of Liverpool)*; Toan M Tran (University of Adelaide); Ian Reid (\"\"University of Adelaide, Australia\"\"); Vijay Kumar (Xerox); Tuan NA Hoang (Singapore University of Technology and Design); Gustavo Carneiro (University of Adelaide)\",Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,3.1,179,M,,,\n5341,Data Representation and Learning with Graph Diffusion-Embedding Networks,Bo Jiang (Anhui University)*; Doudou Lin (Anhui University); Jin Tang (Anhui University); Bin Luo (Anhui University),Representation Learning,Deep Learning,Poster,3.1,180,M,,,\n4277,Video Relationship Reasoning using Gated Spatio-Temporal Energy Graph,\"Yao-Hung Tsai (Carnegie Mellon University)*; Santosh Divvala (Allen Institute for AI); Louis-Philippe Morency (Carnegie Mellon University); Ruslan Salakhutdinov (Carnegie Mellon University); Ali Farhadi (University of Washington, Allen Institute for Artificial Intelligence)\",Visual Reasoning,Statistical Learning,Poster,3.1,181,N,,,\n4385,Image-Question-Answer Synergistic Network for Visual Dialog,Dalu Guo (University of Sydney)*; Chang Xu (University of Sydney); Dacheng Tao (University of Sydney),Vision + Language,Deep Learning,Poster,3.1,182,N,,,\n4732,Not All Frames Are Equal: Weakly-Supervised Video Grounding with Contextual Similarity and Visual Clustering Losses,Jing Shi (University of Rochester)*; 
Jia Xu (Tencent AI Lab); Boqing Gong (Tencent AI Lab); Chenliang Xu (University of Rochester),Vision + Language,Video Analytics,Poster,3.1,183,N,,,\n4781,Inverse Cooking: Recipe Generation from Food Images,Amaia Salvador (Universitat Politècnica de Catalunya)*; Michal Drozdzal (FAIR); Xavier Giro-i-Nieto (Universitat Politecnica de Catalunya); Adriana Romero (FAIR),Vision + Language,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Vision Applications and Systems\",Poster,3.1,184,N,,,\n4833,Adversarial Semantic Alignment for Improved Image Captions,Pierre Dognin (IBM)*; Igor Melnyk (IBM); Youssef Mroueh (IBM Research); Jarret Ross (IBM); Tom Sercu (IBM Research AI),Vision + Language,Deep Learning,Poster,3.1,185,N,,,\n4919,Answer Them All: Toward a Universal VQA Model,Robik S Shrestha (Rochester Institute of Technology)*; Kushal Kafle (Rochester Institute of Technology); Christopher Kanan (Rochester Institute of Technology),Vision + Language,Visual Reasoning,Poster,3.1,186,N,,,\n4967,Unsupervised Multi-modal Neural Machine Translation,Yuanhang Su (University of Southern California); Kai Fan (Alibaba DAMO Academy)*; Nguyen Bach (Alibaba); C.-C. 
Jay Kuo (USC); Fei Huang (Alibaba),Vision + Language,,Poster,3.1,187,N,,,\n4987,Multi-task Learning of Hierarchical Vision-Language Representation,Duy Kien Nguyen (Tohoku University)*; Takayuki Okatani (Tohoku University/RIKEN AIP),Vision + Language,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Representation Learning\",Poster,3.1,188,N,,,\n5052,Cross-Modal Self-Attention Network for Referring Image Segmentation,\"Linwei Ye (University of Manitoba)*; Mrigank Rochan (University of Manitoba); Zhi Liu (Shanghai University, China); Yang Wang (University of Manitoba)\",Vision + Language,\"Scene Analysis and Understanding; Segmentation, Grouping and Shape\",Poster,3.1,189,N,,,\n2180,Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning from Radiology Reports and Label Ontology,\"Ke Yan (National Institutes of Health)*; Yifan Peng (NIH); Veit Sanfort (NIH); Mohammadhadi Bagheri (National Institutes of Health); Zhiyong Lu (NLM/NCBI/NIH); Ronald Summers (National Institutes of Health, Bethesda, Maryland, United States)\",\"Medical, Biological and Cell Microscopy\",\"Datasets and Evaluation; Deep Learning ; Recognition: Detection, Categorization, Retrieval; Represen\",Oral,3.1,190,O,,,\n4246,Robust Histopathology Image Analysis: to Label or to Synthesize?,Le Hou (Stony Brook University)*; Ayush Agarwal (Stanford University); Dimitris Samaras (Stony Brook University); Tahsin Kurc (Stony Brook University); Rajarsi Gupta (Stony Brook University); Joel Saltz (Stony Brook University),\"Medical, Biological and Cell Microscopy\",\"Segmentation, Grouping and Shape; Vision Applications and Systems\",Oral,3.1,191,O,,,\n6477,Data augmentation with spatial and appearance transforms for one-shot medical image segmentation,Amy Zhao (MIT)*; Guha Balakrishnan (MIT); Fredo Durand (MIT); John Guttag (MIT); Adrian V Dalca (MIT),\"Medical, Biological and Cell Microscopy\",Image and Video 
Synthesis,Oral,3.1,192,O,,,\n1853,Shifting More Attention to Video Salient Object Detection,Deng-Ping Fan (Nankai University); Wenguan Wang (Inception Institute of Artificial Intelligence); Ming-Ming Cheng (Nankai University)*; Jianbing Shen (Beijing Institute of Technology),Vision Applications and Systems,Low-level Vision,Oral,3.1,193,O,,,\n864,Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration,De-An Huang (Stanford University)*; Suraj Nair (Stanford University); Danfei Xu (Stanford University); Yuke Zhu (Stanford University); Animesh Garg (Stanford University); Li Fei-Fei (Stanford University); Silvio Savarese (Stanford University); Juan Carlos Niebles (Stanford University),Robotics + Driving,Visual Reasoning,Oral,3.1,194,O,,,\n1296,Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry,\"Fei Xue (Peking University)*; Xin Wang (Peking University); Shunkai Li (Peking University); Qiuyuan Wang (Peking University); Junqiu Wang (Beijing Changcheng Aviation Measurement and Control Institute); Hongbin Zha (Peking University, China)\",Robotics + Driving,Deep Learning ; Motion and Tracking,Oral,3.1,195,O,,,\n3139,Image Generation from Layout,Bo Zhao (University of British Columbia)*; Lili Meng (University of British Columbia); Weidong Yin (University of British Columbia); Leonid Sigal (University of British Columbia),Vision Applications and Systems,Vision + Graphics,Oral,3.1,196,O,,,\n4603,Multimodal Explanations by Predicting Counterfactuality in Videos,Atsushi Kanehira (The University of Tokyo)*; Kentaro Takemoto (University of Tokyo); Sho Inayoshi (The University of Tokyo); Tatsuya Harada (The University of Tokyo),Vision Applications and Systems,Video Analytics; Vision + Language,Oral,3.1,197,O,,,\n4606,Learning to Explain with Complemental Examples,Atsushi Kanehira (The University of Tokyo)*; Tatsuya Harada (The University of Tokyo),Vision Applications and Systems,Vision + Language,Oral,3.1,198,O,,,\n3441,HAQ: 
Hardware-Aware Automated Quantization,Kuan Wang (MIT); Zhijian Liu (MIT); Yujun Lin (MIT); Ji Lin (MIT); Song Han (MIT)*,Vision Applications and Systems,Deep Learning,Oral,3.1,199,O,,,\n4965,Content Authentication for Neural Imaging Pipelines: End-to-end Optimization of Photo Provenance in Complex Distribution Channels,Pawel Korus (New York University)*; Nasir Memon (New York University),Vision Applications and Systems,Low-level Vision,Oral,3.1,200,O,,,\n5712,Inverse Procedural Modeling of Knitwear,Elena Trunz (University of Bonn)*; Sebastian Merzbach (University of Bonn); Jonathan Klein (University of Bonn); Thomas Schulze (University of Bonn); Michael Weinmann (University of Bonn); Reinhard Klein (University of Bonn),Vision Applications and Systems,,Oral,3.1,201,O,,,\n2857,Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video,Zongmian Li (INRIA Paris)*; Jiri Sedlar (CVUT); Justin Carpentier (INRIA); Ivan Laptev (INRIA Paris); Nicolas Mansard (LAAS-CNRS); Josef Sivic (INRIA),Robotics + Driving,Action Recognition ; Video Analytics,Oral,3.1,202,O,,,\n4235,DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds,Li Ding (University of Rochester); Chen Feng (New York University)*,Robotics + Driving,3D from Multiview and Sensors; Deep Learning ; RGBD sensors and analytics,Oral,3.1,203,O,,,\n4880,End-to-end Interpretable Neural Motion Planner,\"Wenyuan Zeng (Uber ATG, University of Toronto); Wenjie Luo (Uber ATG / University of Toronto)*; Shun Da Suo (Uber ATG, University of Toronto); Abbas Sadat (Uber ATG); Bin Yang (Uber ATG & University of Toronto); Sergio Casas Romero (Uber ATG, University of Toronto); Raquel Urtasun (Uber ATG)\",Robotics + Driving,,Oral,3.1,204,O,,,\n4261,DuDoNet: Dual Domain Network for CT Metal Artifact Reduction,\"Wei-An Lin (The University of Maryland, College Park)*; Haofu Liao (University of Rochester); Cheng Peng (The University of Maryland, College Park); Xiaohang Sun (Princeton University); Jingdan 
Zhang (Z2AI Corporation); Jiebo Luo (U. Rochester); Rama Chellappa (University of Maryland); S. Kevin Zhou (ICT)\",\"Medical, Biological and Cell Microscopy\",,Poster,3.1,205,O,,,\n4384,Fast Spatio-Temporal Residual Network for Video Super-Resolution,\"Sheng Li (School of Computer Science, Wuhan University); Fengxiang He (The University of Sydney); Bo Du (School of Computer Science, Wuhan University)*; Lefei Zhang (Wuhan University); Yonghao Xu (Wuhan University); Dacheng Tao (University of Sydney)\",Vision Applications and Systems,Computational Photography,Poster,3.1,206,O,,,\n4419,Complete the Look: Scene-based Complementary Product Recommendation,Wang-Cheng Kang (UC San Diego)*; Eric Kim (Pinterest); Jure Leskovec (Stanford); Charles Rosenberg (Pinterest); Julian McAuley (UCSD),Vision Applications and Systems,,Poster,3.1,207,O,,,\n4628,Selective Sensor Fusion for Neural Visual-Inertial Odometry,Changhao Chen (University of Oxford)*; Stefano Rosa (University of Oxford); Yishu Miao (University of Oxford); Chris Xiaoxuan Lu (University of Oxford); Wei Wu (Tencent); Andrew Markham (University of Oxford); Niki Trigoni (University of Oxford),Robotics + Driving,,Poster,3.1,208,O,,,\n4675,Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes,Chengquan Zhang (Baidu Inc)*; Borong Liang (Xiamen University); Zuming Huang (Baidu Inc.); Mengyi En (Baidu Inc.); Junyu Han (Baidu Inc.); Errui Ding (Baidu Inc.); Xinghao Ding (Xiamen University),Document Analysis,\"Recognition: Detection, Categorization, Retrieval; Segmentation, Grouping and Shape; Vision Applicat\",Poster,3.1,209,O,,,\n4765,Learning Binary Code for Personalized Fashion Recommendation,Zhi Lu (University of Electronic Science and Technology of China); Yang Hu (University of Electronic Science and Technology of China)*; Yunchao Jiang (University of Electronic Science and Technology of China); Yan Chen (University of Electronic Science and Technology of China); Bing Zeng (University of Electronic 
Science and Technology of China),Vision Applications and Systems,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.1,210,O,,,\n4789,Attention Based Glaucoma Detection: A Large-scale Database and CNN Model,\"Liu Li (BUAA); Xiaofei Wang (BeiHang University); Lai Jiang (BUAA); Hanruo Liu (Beijing Tongren Hospital, Capital Medical University); Mai Xu (BUAA)*\",Vision Applications and Systems,,Poster,3.1,211,O,,,\n4829,Privacy Protection in Street-View Panoramas using Depth and Multi-View Imagery,Ries M.C. Uittenbogaard (Royal IHC); Clint Sebastian (Eindhoven University of Technology)*; Julien A. Vijverberg (CycloMedia B.V.); Bas Boom (Cyclomedia); Dariu Gavrila (TU Delft); P. H. N. de With (Eindhoven University of Technology),Vision Applications and Systems,Others,Poster,3.1,212,O,,,\n4838,Grounding Human-to-Vehicle Advice for Self-driving Vehicles,Jinkyu Kim (UC Berkeley)*; Teruhisa Misu (Honda Research Institute USA); Yi-Ting Chen (Honda Research Institute USA); Ashish Tawari (Honda Research Institute); John F Canny (UC Berkeley),Robotics + Driving,Deep Learning ; Vision + Language,Poster,3.1,213,O,,,\n4855,Multi-Step Prediction of Occupancy Grid Maps with Recurrent Neural Networks,Nima Mohajerin (Huawei Noah's Ark)*; Mohsen Rohani (Huawei Noah's Ark),Robotics + Driving,Deep Learning,Poster,3.1,214,O,,,\n4903,Connecting Touch and Vision via Cross-Modal Prediction,Yunzhu Li (MIT)*; Jun-Yan Zhu (MIT); Russ Tedrake (MIT); Antonio Torralba (MIT),Vision Applications and Systems,Image and Video Synthesis,Poster,3.1,215,O,,,\n5238,X2CT-GAN: Reconstructing CT from Biplanar X-Rays with Generative Adversarial Networks,Xingde Ying (ZheJiang University ); Heng Guo (Shanghai Jiaotong University); Kai Ma (Tencent); Jian Wu (Zhejiang University); Zhengxin Weng (Shanghai Jiaotong University); Yefeng Zheng (Tencent)*,\"Medical, Biological and Cell Microscopy\",3D from Multiview and Sensors; Deep Learning ; Image and Video Synthesis,Poster,3.1,216,O,,,\n3041,Practical 
Full Resolution Learned Lossless Image Compression,Fabian Mentzer (ETH Zurich)*; Eirikur Agustsson (ETH Zurich); Michael Tschannen (ETH Zurich); Radu Timofte (ETH Zurich); Luc Van Gool (ETH Zurich),Deep Learning,Representation Learning,Oral,3.2,1,A,,,\n6930,Image-to-Image Translation via Group-wise Deep Whitening and Coloring,Wonwoong Cho (Korea University); Sungha Choi (Korea University); David Park (Korea University); Inkyu Shin (Hanyang University); Jaegul Choo (Korea University)*,Deep Learning,Computer Vision Theory; Image and Video Synthesis; Vision Applications and Systems,Oral,3.2,2,A,,,\n6265,Max-Sliced Wasserstein Distance and its use for GANs,\"Ishan Deshpande (-); Yuan-Ting Hu (University of Illinois at Urbana-Champaign); Ruoyu Sun (University of Illinois at Urbana-Champaign); Ayis Pyrros (Dupagemd); Nasir Siddiqui (Dupagemd); Sanmi Koyejo (University of Illinois, Urbana-Champaign); Zhizhen Zhao (University of Illinois at Urbana-Champaign); David Forsyth (University of Illinois at Urbana-Champaign); Alexander Schwing (UIUC)*\",Deep Learning,,Oral,3.2,3,A,,,\n3073,Meta-Learning with Differentiable Convex Optimization,\"Kwonjoon Lee (UC San Diego)*; Subhransu Maji (University of Massachusetts, Amherst); Avinash Ravichandran (Amazon); Stefano Soatto (AWS Amazon ML)\",Deep Learning,Representation Learning,Oral,3.2,4,A,,,\n4645,Unsupervised Visual Domain Adaptation: A Deep Max-Margin Gaussian Process Approach,\"Minyoung Kim (SeoulTech, Rutgers University)*; Pritish Sahu (Rutgers University); Behnam Gholami (Rutgers University); Vladimir Pavlovic (Rutgers University)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval; Statistical Learning\",Oral,3.2,5,A,,,\n6332,Tangent-Normal Adversarial Regularization for Semi-supervised Learning,Bing Yu (Peking University); Jingfeng Wu (Peking University); Jinwen Ma (Peking University); Zhanxing Zhu (Peking University)*,Deep Learning,Representation Learning,Oral,3.2,6,A,,,\n3306,Auto-Encoding Scene Graphs 
for Descriptive Image Captioning,XU YANG (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Hanwang Zhang (Nanyang Technological University); Jianfei Cai (Nanyang Technological University),Deep Learning,Vision + Language,Oral,3.2,7,A,,,\n6218,\"Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech\",\"Aditya Deshpande (University of Illinois at UC)*; Jyoti Aneja (University of Illinois, Urbana-Champaign); Liwei Wang (Tencent AI Lab); Alexander Schwing (UIUC); David Forsyth (University of Illinois at Urbana-Champaign)\",Deep Learning,\"Big Data, Large Scale Methods\",Oral,3.2,8,A,,,\n6105,Attention Branch Network: Learning of Attention Mechanism for Visual Explanation,Hiroshi Fukui (Chubu university)*; Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University),Deep Learning,\"Big Data, Large Scale Methods ; Computer Vision Theory; Face, Gesture, and Body Pose ; Recognition:\",Oral,3.2,9,A,,,\n3796,Cascaded Projection: End-to-End Network Compression and Acceleration,Breton L Minnehan (Rochester Institute of Technology)*; Andreas Savakis (Rochester Institute of Technology),Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Oral,3.2,10,A,,,\n5721,DeepCaps : Going Deeper with Capsule Networks,Jathushan Rajasegaran ( University of Moratuwa)*; Vinoj Jayasundara (University of Moratuwa); Sandaru Jayasekara (University of Moratuwa); Hirunima Jayasekara (University of Moratuwa); Ranga Rodrigo (University of Moratuwa); Suranga Seneviratne (University of Sydney),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Oral,3.2,11,A,,,\n6240,DNASNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search,\"Bichen Wu (UC Berkeley)*; Xiaoliang Dai (Princeton University); Peizhao Zhang (Facebook); Yanghan Wang (Facebook); Fei Sun (Facebook); Yiming Wu (Facebook); Yuandong Tian (Facebook); Peter 
Vajda (Facebook); Yangqing Jia (Facebook); Kurt Keutzer (EECS, UC Berkeley)\",Deep Learning,,Oral,3.2,12,A,,,\n5032,APDrawingGAN: Generating Artistic Portrait Drawings from Face Photos with Hierarchical GANs,Ran Yi (Tsinghua University); Yong-Jin Liu (Tsinghua University)*; Yukun Lai (Cardiff University); Paul Rosin (Cardiff University),Deep Learning,Others,Oral,3.2,13,A,,,\n6431,Constrained Generative Adversarial Networks for Interactive Image Generation,Eric Heim (AFRL/RI)*,Deep Learning,Image and Video Synthesis; Representation Learning,Oral,3.2,14,A,,,\n6807,WarpGAN: Automatic Caricature Generation,Yichun Shi (Michigan State University)*; Debayan Deb (Michigan State University); Anil Jain (Michigan State University),Deep Learning,\"Computer Vision Theory; Face, Gesture, and Body Pose\",Oral,3.2,15,A,,,\n5199,Explainability Methods for Graph Convolutional Neural Networks,\"Phillip Pope (HRL Laboratories, LLC); Soheil Kolouri (HRL Laboratories LLC)*; Mohammad Rostami (HRL Laboratories, LLC); Charles Martin (HRL Laboratories, LLC); Heiko Hoffmann (HRL)\",Deep Learning,Deep Learning,Oral,3.2,16,A,,,\n5502,A Generative Adversarial Density Estimator,Ehsan M Abbasnejad (Adelaide)*; Qinfeng Shi (University of Adelaide); Anton van den Hengel (University of Adelaide); Lingqiao Liu (University of Adelaide),Deep Learning,Representation Learning; Statistical Learning,Oral,3.2,17,A,,,\n5921,SoDeep: a Sorting Deep net to learn ranking loss surrogates,\"Martin Engilberge (Technicolor, Sorbonne universités)*; Louis Chevallier (Technicolor); Patrick Pérez (Valeo.ai); Matthieu Cord (Sorbonne University)\",Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval; Vision + Language\",Oral,3.2,18,A,,,\n89,Pixel Adaptive Convolutional Neural Networks,\"Hang Su (University of Massachusetts, Amherst)*; Varun Jampani (Nvidia Research); Deqing Sun (NVIDIA); Orazio Gallo (NVIDIA Research); Erik Learned-Miller (University of Massachusetts, Amherst); Jan Kautz 
(NVIDIA)\",Deep Learning,Low-level Vision; Scene Analysis and Understanding,Poster,3.2,19,A,,,\n5526,Single-frame Regularization for Temporally Stable CNNs,Gabriel Eilertsen (Linköping University)*; Jonas Unger (Linköpings universitet); Rafal Mantiuk (University of Cambridge),Deep Learning,Image and Video Synthesis; Optimization Methods,Poster,3.2,20,A,,,\n5560,An End-to-End Network for Generating Social Relationship Graphs,\"Arushi Goel (Agency for Science, Technology and Research)*; Keng Teck Ma (Agency for Science, Technology and Research); Cheston Tan (Institute for Infocomm Research, Singapore)\",Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.2,21,A,,,\n5561,Meta-learning Convolutional Neural Architectures for Multi-target Concrete Defect Classification with the COncrete DEfect BRidge IMage Dataset,Martin Mundt (FIAS)*; Sagnik Majumder (Goethe University); Sreenivas Narasimha Murali (Karomi); Panagiotis Panetsos (Egnatia Odos A. E.); Visvanathan Ramesh (Univ. 
Frankfurt),Deep Learning,\"Datasets and Evaluation; Recognition: Detection, Categorization, Retrieval; Representation Learning;\",Poster,3.2,22,A,,,\n5758,ECC: Energy Constrained Deep Neural Network Compression via a Bilinear Regression Model,Haichuan Yang (University of Rochester)*; Yuhao Zhu (University of Rochester); Ji Liu (University of Rochester),Deep Learning,Optimization Methods,Poster,3.2,23,A,,,\n5807,SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity through Low-Bit Quantization,Shijie Cao (Harbin Institute of Technology); Lingxiao Ma (Peking University); Wencong Xiao (Beihang University); Chen Zhang (Microsoft Research Asia)*; Yunxin Liu (MSRA); lintao Zhang (Microsoft Research Asia); Lanshun Nie (Harbin Institute of Technology); Zhi Yang (Peking University),Deep Learning,Vision Applications and Systems,Poster,3.2,24,A,,,\n5842,Defending against adversarial attacks by randomized diversification,\"Olga Taran (Geneva University)*; Shideh Rezaeifar (Geneva University); Taras Holotyak (Geneva University); Slava Voloshynovskiy (CUI, University of Geneva)\",Deep Learning,Others,Poster,3.2,25,A,,,\n5879,\"Adv-GAN: Generator, Discriminator, and Adversarial Attacker\",\"Xuanqing Liu (UCLA Department of Computer Science); Cho-Jui Hsieh (UCLA, Google)*\",Deep Learning,,Poster,3.2,26,A,,,\n5939,Learning From Noisy Labels By Regularized Estimation Of Annotator Confusion,\"Ryutaro Tanno (University College London)*; Ardavan Saeedi (Butterfly Network Inc); Swami Sankaranarayanan (Butterfly Network Inc, NewYork, NY); Daniel Alexander (University College London); Nathan Silberman (Butterfly Network)\",Deep Learning,Others,Poster,3.2,27,A,,,\n6022,Task-Free Continual Learning,Rahaf Aljundi (KU Leuven)*; Klaas Kelchtermans (KULeuven); Tinne Tuytelaars (K.U. 
Leuven),Deep Learning,Others; Robotics + Driving; Vision Applications and Systems,Poster,3.2,28,A,,,\n6102,Importance Estimation for Neural Network Pruning,Pavlo Molchanov (NVIDIA)*; Arun Mallya (NVIDIA); Stephen Tyree (NVIDIA); Iuri Frosio (NVIDIA); Jan Kautz (NVIDIA),Deep Learning,Optimization Methods,Poster,3.2,29,A,,,\n6103,Detecting Overfitting of Deep Generative Networks via Latent Recovery,Ryan P Webster (UniCaen)*; Julien Rabin (Unicaen); Loic Simon (GREYC/ENSICAEN); Frederic Jurie (University of Caen),Deep Learning,Datasets and Evaluation; Image and Video Synthesis,Poster,3.2,30,A,,,\n6148,Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks,Seungjoo Yoo (Korea University); Hyojin Bahng (Korea University); Sunghyo Chung (Korea University); junsoo lee (Naver Webtoon); Jaehyuk Chang (NAVER Webtoon Corp); Jaegul Choo (Korea University)*,Deep Learning,Computational Photography; Computer Vision Theory; Image and Video Synthesis; Vision + Graphics ; Vi,Poster,3.2,31,A,,,\n6169,Characterizing and Avoiding Negative Transfer,Zirui Wang (Carnegie Mellon University)*; Zihang Dai (Carnegie Mellon University); Barnabas Poczos ( Carnegie Mellon University); Jaime Carbonell (Carnegie Mellon University),Deep Learning,Deep Learning ; Statistical Learning,Poster,3.2,32,A,,,\n6173,Building Efficient Deep Neural Networks with Unitary Group Convolutions,Ritchie Zhao (Cornell University)*; Yuwei Hu (Cornell University); Jordan A Dotzel (Cornell University); Chris De Sa (Cornell); Zhiru Zhang (Cornell University),Deep Learning,,Poster,3.2,33,A,,,\n6180,Semi-supervised Learning with Graph Learning-Convolutional Networks,Bo Jiang (Anhui University)*; Ziyan Zhang (Anhui University); Doudou Lin (Anhui University); Jin Tang (Anhui University); Bin Luo (Anhui University),Deep Learning,Representation Learning,Poster,3.2,34,A,,,\n6280,Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning,Oleksiy Ostapenko (Humboldt Universität zu 
Berlin)*; Tassilo Klein (SAP); Mihai O Puscas (University of Trento); Patrick Jähnichen (Humboldt Universität zu Berlin); Moin Nabi (SAP),Deep Learning,\"Big Data, Large Scale Methods ; Recognition: Detection, Categorization, Retrieval; Vision Applicatio\",Poster,3.2,35,A,,,\n6318,AIRD: Adversarial Learning Framework for Image Repurposing Detection,Ayush Jaiswal (University of Southern California)*; Yue Wu (USC ISI); Wael Abd-Almageed (Information Sciences Institute); Iacopo Masi (University of Southern California); Prem Natarajan (USC ISI),Deep Learning,Vision Applications and Systems,Poster,3.2,36,A,,,\n6322,A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations,Saeid Asgari Taghanaki (Simon Fraser University)*; Kumar Abhishek (Simon Fraser University ); Shekoofeh Azizi (University of British Columbia); Ghassan Hamarneh (Simon Fraser University),Deep Learning,\"Medical, Biological and Cell Microscopy\",Poster,3.2,37,A,,,\n6365,Trust Region Based Adversarial Attack on Neural Networks,\"Zhewei Yao (University of California, Berkeley); Amir Gholami (UC Berkeley)*; Peng Xu (Amazon); Kurt Keutzer (EECS, UC Berkeley); Michael Mahoney (\"\"University of California, Berkeley\"\")\",Deep Learning,Others,Poster,3.2,38,A,,,\n6375,Fast Image Inpainting with Parallel Decoding Network,Min-Cheol Sagong (Korea Univ.); Yong-Goo Shin (Korea University); Seung-Wook Kim (Korea University); Seung Park (Korea University); Sung-Jea Ko (Korea University)*,Deep Learning,Image and Video Synthesis,Poster,3.2,39,A,,,\n6455,Model-blind Video Denoising Via Frame-to-frame Training,\"Thibaud Ehret (CMLA, ENS Cachan)*; Axel Davy (ENS Paris-Saclay); Gabriele Facciolo (ENS Paris-Saclay); Jean-Michel Morel (ENS Paris-Saclay); Pablo Arias (Université Paris-Saclay)\",Deep Learning,Low-level Vision,Poster,3.2,40,A,,,\n6535,End-to-End Efficient Representation Learning via Cascading Combinatorial Optimization,Yeonwoo Jeong (Seoul National University)*; Yoonsung Kim (SNU 
Machine Learning Lab); Hyun Oh Song (Seoul National University),Deep Learning,Representation Learning,Poster,3.2,41,A,,,\n6570,Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation,\"Fengda Zhu (UTS)*; Linchao Zhu (University of Technology, Sydney); Yi Yang (UTS)\",Deep Learning,Representation Learning; Robotics + Driving,Poster,3.2,42,A,,,\n6629,ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation,Xiaoliang Dai (Princeton University)*; Peizhao Zhang (Facebook); Bichen Wu (UC Berkeley); Hongxu Yin (Princeton University); Fei Sun (Facebook); Yanghan Wang (Facebook); Marat Dukhan (Facebook); Yunqing Hu (Facebook); Yiming Wu (Facebook); Yangqing Jia (Facebook); Peter Vajda (Facebook); Matt Uyttendaele (Facebook); Niraj K Jha (Princeton University),Deep Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.2,43,A,,,\n6646,Regularizing Activation Distribution for Training Binarized Deep Networks,Ruizhou Ding (Carnegie Mellon University)*; Ting-Wu Chin (Carnegie Mellon University); Diana Marculescu (Carnegie Mellon University); Zeye Liu (Carnegie Mellon University),Deep Learning,Vision Applications and Systems,Poster,3.2,44,A,,,\n6687,Robustness Verification of Classification Deep Neural Networks via Linear Programming,Wang Lin (Zhejiang Sci-Tech University); Zhengfeng Yang (East China Normal University)*; Xin Chen (Nanjing University); Qingye Zhao (Nanjing University); Xiangkun Li (East China Normal University); Zhiming Liu (Southwest University); Jifeng He (East China Normal University),Deep Learning,,Poster,3.2,45,A,,,\n6976,Additive Adversarial Learning for Unbiased Authentication,\"Jian Liang (Cloud and Smart Industries Group, Tencent, Beijing)*; Yuren Cao (Cloud and Smart Industries Group, Tencent, Guangzhou); Chenbin Zhang (University of Electronic Science and Technology of China); Shiyu Chang (IBM Research); Kun Bai (Tencent Inc); Zenglin Xu (University of Electronic Science and Technology of China)\",Deep 
Learning,Representation Learning,Poster,3.2,46,A,,,\n7079,Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network using Truncated Gaussian Approximation,Zhezhi He (University of Central Florida)*; Deliang Fan (University of Central Florida),Deep Learning,\"Optimization Methods; Recognition: Detection, Categorization, Retrieval\",Poster,3.2,47,A,,,\n7112,Adversarial Defense by Stratified Convolutional Sparse Coding,Hao Su (UCSD)*; Ronald Yu (UCSD); Bo Sun (Peking University); Fangchen Liu (UCSD); Nian-Hsuan Tsai (NTHU),Deep Learning,Representation Learning; Statistical Learning,Poster,3.2,48,A,,,\n5422,Exploring Object Relation in Mean Teacher for Cross-Domain Detection,Qi Cai (University of Science and Technology of China); Yingwei Pan (JD AI Research)*; Chong-Wah Ngo (City University of Hong Kong); Xinmei Tian (USTC); Lingyu Duan (Peking University); Ting Yao (JD AI Research),\"Recognition: Detection, Categorization, Retrieval\",,Poster,3.2,49,B,,,\n5582,Hierarchical Disentanglement of Discriminative Latent Features for Zero-shot Learning,\"Bin Tong (Hitachi Ltd.)*; Chao Wang (Ocean University of China); Martin Klinkigt (Hitachi, Ltd.); Yoshiyuki Kobayashi (Hitachi); Yuichi Nonaka (Hitachi)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Representation Learning,Poster,3.2,50,B,,,\n5694,R2GAN: Cross-modal Recipe Retrieval with Generative Adversarial Network,Bin Zhu (City University of Hong Kong)*; Jingjing Chen (City University of HongKong); Yanbin Hao (City University of Hong Kong); Chong-Wah Ngo (City University of Hong Kong),\"Recognition: Detection, Categorization, Retrieval\",Image and Video Synthesis; Representation Learning; Vision + Language,Poster,3.2,51,B,,,\n5745,Rethinking Knowledge Graph Propagation for Zero-Shot Learning,\"Michael C. 
Kampffmeyer (UiT The Arctic University of Norway)*; Yinbo Chen (Tsinghua University); Xiaodan Liang (Sun Yat-sen University); Hao Wang (Massachusetts Institute of Technology); Yujia Zhang (Institute of Automation, Chinese Academy of Sciences); Eric Xing (Petuum Inc. and CMU)\",\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.2,52,B,,,\n5863,Learning to Learn Image Classifiers with Visual Analogy,Linjun Zhou (Tsinghua University)*; Peng Cui (Tsinghua University); Shiqiang Yang (Tsinghua University); Wenwu Zhu (Tsinghua University); Qi Tian (Huawei Noah’s Ark Lab),\"Recognition: Detection, Categorization, Retrieval\",Computer Vision Theory; Deep Learning,Poster,3.2,53,B,,,\n5895,Where’s Wally now? Deep Generative and Discriminative Embeddings for Novelty Detection,Philippe Burlina (JHU/APL/CS/SOM)*; Neil Joshi (Johns Hopkins U.); i_jeng Wang (Johns Hopkins U.),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning ; Scene Analysis and Understanding; Statistical Learning,Poster,3.2,54,B,,,\n5972,Weakly Supervised Image Classification through Noise Regularization,Mengying Hu (Institute of Computing Technology Chinese Academy of Sciences); Hu Han (Chinese Academy of Sciences)*; Shiguang Shan (Chinese Academy of Sciences); Xilin Chen (China),\"Recognition: Detection, Categorization, Retrieval\",Statistical Learning,Poster,3.2,55,B,,,\n5986,Data-Driven Neuron Allocation for Scale Aggregation Networks,Yi Li (SenseTime Research)*; Zhanghui Kuang (Sensetime Ltd.); Yimin Chen (sensetime); Wei Zhang (SenseTime Research),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.2,56,B,,,\n6075,Graphical Contrastive Losses for Scene Graph Generation,Ji Zhang (Rutgers University)*; Kevin Shih (NVIDIA); Andrew Tao (NVIDIA); Bryan Catanzaro (NVIDIA); Ahmed Elgammal (Rutgers University),\"Recognition: Detection, Categorization, Retrieval\",Scene Analysis and Understanding,Poster,3.2,57,B,,,\n6203,Deep Transfer Learning for 
Multiple Class Novelty Detection,Pramuditha Perera (Johns Hopkins University)*; Vishal Patel (Johns Hopkins University),\"Recognition: Detection, Categorization, Retrieval\",Representation Learning; Vision Applications and Systems,Poster,3.2,58,B,,,\n6347,QATM: Quality-Aware Template Matching For Deep Learning,Jiaxin Cheng (USC Information Sciences Institute)*; Yue Wu (USC ISI); Wael Abd-Almageed (Information Sciences Institute); Prem Natarajan (USC ISI),\"Recognition: Detection, Categorization, Retrieval\",,Poster,3.2,59,B,,,\n6528,Retrieval-augmented Convolutional Neural Networks against adversarial examples,Junbo Zhao (New York University)*; Kyunghyun Cho (New York University),\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.2,60,B,,,\n6538,Learning Cross-Modal Embeddings with Adversarial Networks for Cooking Recipes and Food Images,Hao Wang (Singapore Management University)*; Doyen Sahoo (Singapore Management University); Chenghao Liu (Singapore Management University); Ee-peng Lim (Singapore Management University); Steven Hoi (SMU),\"Recognition: Detection, Categorization, Retrieval\",\"Big Data, Large Scale Methods ; Medical, Biological and Cell Microscopy; Others\",Poster,3.2,61,B,,,\n7022,FastDraw: Lane Detection by a Sequential Prediction Network,Jonah Philion (ISEE Inc.)*,\"Recognition: Detection, Categorization, Retrieval\",Deep Learning,Poster,3.2,62,B,,,\n7086,Weakly Supervised Video Moment Retrieval From Text Queries,\"Niluthpol c Mithun (UC Riverside)*; Sujoy Paul (UC Riverside); Amit Roy-Chowdhury (University of California, Riverside, USA )\",\"Recognition: Detection, Categorization, Retrieval\",Vision + Language,Poster,3.2,63,B,,,\n5499,Scale-Aware Multi-Level Guidance for Interactive Instance Segmentation,Soumajit Majumder (University of Bonn)*; Angela Yao (National University of Singapore),\"Segmentation, Grouping and Shape\",Deep Learning ; Vision Applications and Systems,Poster,3.2,64,C,,,\n5805,Greedy Structure 
Learning of Hierarchical Compositional Models,Adam Kortylewski (Johns Hopkins University)*; Aleksander Wieczorek (University of Basel); Mario Wieser (University of Basel); Clemens Blumer (University of Basel); Andreas Morel-Forster (University of Basel); Sonali Parbhoo (University of Basel); Volker Roth (University of Basel); Thomas Vetter (University of Basel),\"Segmentation, Grouping and Shape\",Representation Learning; Visual Reasoning,Poster,3.2,65,C,,,\n6354,Interactive Full Image Segmentation,Eirikur Agustsson (Google)*; Jasper Uijlings (Google Research); Vittorio Ferrari (Google Research),\"Segmentation, Grouping and Shape\",,Poster,3.2,66,C,,,\n6376,Learning Active Contour Models for Medical Image Segmentation,Xu Chen (University of Liverpool); Bryan M. Williams (University of Liverpool); Srinivasa Vallabhaneni (University of Liverpool and Royal Liverpool & Broadgreen University Hospitals NHS Trust); Gabriela Czanner (Liverpool John Moores University); Rachel Williams (University of Liverpool); Yalin Zheng (University of Liverpool)*,\"Segmentation, Grouping and Shape\",Deep Learning,Poster,3.2,67,C,,,\n6887,Customizable Architecture Search for Semantic Segmentation,Yiheng Zhang (University of Science and Technology of China); Zhaofan Qiu (University of Science and Technology of China); Jingen Liu (JD); Ting Yao (JD AI Research)*; Dong Liu (University of Science and Technology of China); Tao Mei (AI Research of JD.com),\"Segmentation, Grouping and Shape\",,Poster,3.2,68,C,,,\n5468,Local Features and Visual Words Emerge in Activations,\"Oriane Siméoni (Inria)*; Yannis Avrithis (Inria); Ondrej Chum (Vision Recognition Group, Czech Technical University in Prague)\",Computer Vision Theory,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.2,69,D,,,\n5509,Hyperspectral Image Super-Resolution with Optimized RGB Guidance,Ying Fu (Beijing Institute of Technology)*; Tao Zhang (Beijing Institute of Technology); Yinqiang Zheng (National Institute of 
Informatics); Debing Zhang (DeepGlint); Hua Huang (Beijing Institute of Technology),Physics-based Vision and Shape-from-X,Computational Photography; Deep Learning ; Low-level Vision,Poster,3.2,70,D,,,\n5671,Domain-Aware Generalized Zero-Shot Learning,Yuval Atzmon (Bar-Ilan University)*; Gal CHECHIK (Bar Ilan University),Statistical Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.2,71,D,,,\n5748,PMS-Net: Robust Haze Removal Based on Patch Map for Single Images,Wei-Ting Chen (National Taiwan University)*; Jian-Jiun Ding (Nil); Sy-Yen Kuo (National Taiwan University),Computer Vision Theory,Computational Photography,Poster,3.2,72,D,,,\n5905,Deep Spherical Hashing,Sepehr Eghbali (University of Waterloo)*; Ladan Tahvildari (University of Waterloo),\"Big Data, Large Scale Methods\",Deep Learning,Poster,3.2,73,D,,,\n6117,Large-scale interactive object segmentation with human annotators,Rodrigo Benenson (Google)*; Stefan Popov (Google); Vittorio Ferrari (Google Research),Datasets and Evaluation,\"Big Data, Large Scale Methods ; Others; Segmentation, Grouping and Shape; Vision Applications and Sy\",Poster,3.2,74,D,,,\n6327,A Poisson-Gaussian Denoising Dataset for Real Fluorescence Microscopy Images,Yide Zhang (University of Notre Dame)*; Yinhao Zhu (University of Notre Dame); Evan Nichols (University of Notre Dame); Qingfei Wang (University of Notre Dame); Siyuan Zhang (University of Notre Dame); Cody Smith (University of Notre Dame); Scott Howard (University of Notre Dame),Datasets and Evaluation,\"Medical, Biological and Cell Microscopy\",Poster,3.2,75,D,,,\n6484,Task Agnostic Meta-Learning for Few-Shot Learning,Muhammad Abdullah Jamal (University of Central Florida)*; Guo-Jun Qi (University of Central Florida),Statistical Learning,Deep Learning ; Statistical Learning,Poster,3.2,76,D,,,\n6589,Progressive Ensemble Networks with Adaptive Label Embeddings for Zero Shot Recognition,Meng Ye (Temple University); Yuhong Guo (Carleton 
University)*,Statistical Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.2,77,D,,,\n6749,Direct Object Recognition Without Line-of-sight Using Optical Coherence,Xin Lei (Coherent AI LLC); Liangyu He (Coherent AI LLC); Ken Xingze Wang (Coherent AI LLC)*; Xinggang Wang (Huazhong Univ. of Science and Technology); Yixuan Tan (University of Wisconsin-Madison); Yihan Du (Huazhong Univ. of Science and Technology); Shanhui Fan (Stanford University); Zongfu Yu (University of Wisconsin Madison),Physics-based Vision and Shape-from-X,,Poster,3.2,78,D,,,\n6981,Atlas of Digital Pathology: A Generalized Hierarchical Histological Tissue Type-Annotated Database for Deep Learning,Mahdi S Hosseini (University of Toronto)*; Gabriel Tse (University of Toronto); Michael Tang (University of Toronto); Jun Deng (University of Toronto); Sajad Norouzi (University of Toronto); Konstantinos N Plataniotis (UofT); Savvas Damaskinos (Huron Digital Pathology); Corwyn Rowsell (st michael hospital); Lyndon Chan (University of Toronto),Datasets and Evaluation,Deep Learning,Poster,3.2,79,D,,,\n5565,Perturbation Analysis of the 8-Point Algorithm: a Case Study for Wide FoV Cameras,Thiago L T da Silveira (Federal University of Rio Grande do Sul)*; Claudio R Jung (UFRGS),3D from Multiview and Sensors,Motion and Tracking,Poster,3.2,80,E,,,\n6014,Robustness of 3D Deep Learning in an Adversarial Setting,Matthew R Wicker (University of Oxford)*; Marta Kwiatkowska (University of Oxford),3D from Multiview and Sensors,Computer Vision Theory,Poster,3.2,81,E,,,\n6328,SceneCode: Monocular Dense Semantic Reconstruction using Learned Encoded Scene Representations,Shuaifeng Zhi (Imperial College London)*; Michael Bloesch (Imperial College London); Stefan Leutenegger (Imperial College London); Andrew Davison (Imperial College London),3D from Multiview and Sensors,3D from Single Image; Scene Analysis and Understanding,Poster,3.2,82,E,,,\n6433,StereoDRNet: Dilated Residual StereoNet,\"Rohan Chabra 
(University of North Carolina at Chapel Hill)*; Julian Straub (Facebook Reality Labs); Christopher Sweeny (Facebook Reality Labs); Richard Newcombe (\"\"Facebook / Occulus, USA\"\"); Henry Fuchs (unc)\",3D from Multiview and Sensors,Deep Learning,Poster,3.2,83,E,,,\n6633,The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation,\"Dylan Campbell (Australian National University)*; Lars Petersson (Data61/CSIRO); Laurent Kneip (ShanghaiTech University); HONGDONG LI (Australian National University, Australia); Stephen Gould (Australian National University, Australia)\",3D from Multiview and Sensors,3D from Single Image; Optimization Methods,Poster,3.2,84,E,,,\n88,Learning joint reconstruction of hands and manipulated objects,Yana Hasson (Inria)*; Gul Varol (INRIA); Dimitrios Tzionas (Max Planck Institute for Intelligent Systems); Igor Kalevatykh (INRIA Paris); Michael J. Black (Max Planck Institute for Intelligent Systems); Ivan Laptev (INRIA Paris); Cordelia Schmid (INRIA),3D from Single Image,\"Deep Learning ; Face, Gesture, and Body Pose\",Poster,3.2,85,F,,,\n5538,Deep Single Image Camera Calibration with Radial Distortion,\"Manuel Lopez-Antequera (Mapillary)*; Roger Marí Molas (CMLA, ENS Cachan); Pau Gargallo (Mapillary); Yubin Kuang (Mapillary AB); Javier Gonzalez-Jimenez (University of Malaga); Gloria Haro (Universitat Pompeu Fabra)\",3D from Single Image,Deep Learning ; Vision + Graphics ; Vision Applications and Systems,Poster,3.2,86,F,,,\n5655,CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth Prediction,José M Fácil (University of Zaragoza)*; Benjamin Ummenhofer (Intel Labs); Huizhong Zhou (The University of Freiburg); Luis Montesano (University of Zaragoza; Bitbrain); Thomas Brox (University of Freiburg); Javier Civera (Universidad de Zaragoza),3D from Single Image,,Poster,3.2,87,F,,,\n5730,Translate-to-Recognize Networks for RGB-D Indoor Scene Recognition,Dapeng Du (Nanjing University); Huiling 
Wang (Nanjing University); Kai Zhao (Nanjing University); Gangshan Wu (Nanjing University); Limin Wang (Nanjing University)*,RGBD sensors and analytics,\"Recognition: Detection, Categorization, Retrieval; Representation Learning; Scene Analysis and Under\",Poster,3.2,88,F,,,\n6024,Re-Identification Supervised 3D Texture Generation,\"Jian Wang (State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences & University of Chinese Academy of Sciences)*; Yunshan Zhong (Peking University); Yachun Li (Zhejiang University); Chi Zhang (Megvii Inc.); Yichen Wei (Megvii Research Shanghai)\",3D from Single Image,Image and Video Synthesis,Poster,3.2,89,F,,,\n6276,Action4D: Online Action Recognition in the Crowd and Clutter,Quanzeng You (Microsoft); Hao Jiang (Microsoft)*,RGBD sensors and analytics,Action Recognition ; Deep Learning,Poster,3.2,90,F,,,\n6696,Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction,Jason Ku (University of Toronto)*; Alex D Pon (University of Toronto); Steven L Waslander (University of Toronto),3D from Single Image,Deep Learning ; Robotics + Driving; Scene Analysis and Understanding,Poster,3.2,91,F,,,\n4,High-Quality Face Capture Using Anatomical Muscles,Michael H Bao (Stanford University)*; Matthew D Cong (Industrial Light & Magic); Stephane Grabli (Industrial Light & Magic); Ronald Fedkiw (Stanford),\"Face, Gesture, and Body Pose\",3D from Single Image; Physics-based Vision and Shape-from-X; Vision + Graphics,Oral,3.2,92,G,,,\n2408,FML: Face Model Learning from Videos,Ayush Tewari (Max Planck Institute for Informatics)*; Florian Bernard (Max Planck Institute for Informatics); Pablo Garrido (Technicolor); Gaurav Bharaj (Technicolor); Mohamed Elgharib (Max Planck Institute for Informatics); Hans-Peter Seidel (Max Planck Institute for Informatics); Patrick Pérez (Valeo.ai); Michael Zollhoefer (Stanford University); Christian Theobalt (MPI Informatik),\"Face, Gesture, and Body Pose\",3D from 
Single Image; Vision + Graphics,Oral,3.2,93,G,,,\n4483,AdaScale: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations,\"Xiao Zhang (Chinese University of Hong Kong); Rui Zhao (SenseTime Group Limited); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*; Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Hongsheng Li (Chinese University of Hong Kong)\",\"Face, Gesture, and Body Pose\",\"Biometrics ; Deep Learning ; Recognition: Detection, Categorization, Retrieval; Representation Learn\",Oral,3.2,94,G,,,\n387,3D Hand Shape and Pose Estimation from a Single RGB Image,\"Liuhao Ge (Nanyang Technological University)*; Zhou Ren (Snap Inc.); Yuncheng Li (Snap); Zehao Xue (Snap Inc.); Yingying Wang (Snap Inc.); Jianfei Cai (Nanyang Technological University); Junsong Yuan (\"\"State University of New York at Buffalo, USA\"\")\",\"Face, Gesture, and Body Pose\",3D from Single Image,Oral,3.2,95,G,,,\n647,3D hand shape and pose from images in the wild,Adnane Boukhayma (University of Oxford)*; Rodrigo de Bem (University of Oxford); Philip Torr (University of Oxford),\"Face, Gesture, and Body Pose\",3D from Single Image,Oral,3.2,96,G,,,\n843,Self supervised 3D hand pose estimation,Chengde Wan (ETHZ)*; Thomas Probst (ETH Zurich); Luc Van Gool (ETH Zurich); Angela Yao (National University of Singapore),\"Face, Gesture, and Body Pose\",RGBD sensors and analytics,Oral,3.2,97,G,,,\n1497,CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark,Jiefeng Li (Shanghai Jiao Tong University); Can Wang (SJTU); Hao Zhu (Shanghai Jiao Tong University); Yihuan Mao (Tsinghua University); Hao-Shu Fang (SJTU); Cewu Lu (Shanghai Jiao Tong University)*,\"Face, Gesture, and Body Pose\",\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Oral,3.2,98,G,,,\n3233,Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in A Triadic Interaction,\"Hanbyul Joo (CMU)*; Tomas Simon 
(CMU); Mina Cikara (Department of Psychology, Harvard); Yaser Sheikh (CMU)\",\"Face, Gesture, and Body Pose\",Datasets and Evaluation,Oral,3.2,99,G,,,\n6947,\"Synergistic, Part-Based 3D Human Reconstruction In-The-Wild\",Alp Guler (Imperial College London)*; Iasonas Kokkinos (UCL),\"Face, Gesture, and Body Pose\",3D from Single Image; Deep Learning,Oral,3.2,100,G,,,\n2239,Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation,Xipeng Chen (Sun Yat-sen University); Kwan-Yee Lin (Peking university); Wentao Liu (Sensetime); Chen Qian (SenseTime); Liang Lin (Sun Yat-sen University)*,\"Face, Gesture, and Body Pose\",Deep Learning,Oral,3.2,101,G,,,\n3999,In the Wild Human Pose Estimation using Explicit 2D Features and Intermediate 3D Representations,\"Ikhsanul Habibie (Max Planck Institute for Informatics)*; Weipeng Xu (MPII); Dushyant Mehta (MPI Informatics); Gerard Pons-Moll (MPII, Germany); Christian Theobalt (MPI Informatik)\",\"Face, Gesture, and Body Pose\",3D from Single Image; Motion and Tracking,Oral,3.2,102,G,,,\n33,DensePose-Slim: Cheaper Learning from Motion Cues,Natalia Neverova (Facebook AI Research)*; James Thewlis (University of Oxford); Alp Guler (Imperial College London); Iasonas Kokkinos (UCL); Andrea Vedaldi (Oxford University),\"Face, Gesture, and Body Pose\",Datasets and Evaluation; Deep Learning ; Motion and Tracking,Oral,3.2,103,G,,,\n2859,Twin-Cycle Autoencoder: Self-supervised Representation Learning from Entangled Movement for Facial Action Unit Detection,\"Yong Li (Institute of Computing Technology, Chinese Academy of Sciences); Jiabei Zeng (Institute of Computing Technology, Chinese Academy on Sciences)*; Shiguang Shan (Chinese Academy of Sciences); Xilin Chen (China)\",\"Face, Gesture, and Body Pose\",Deep Learning ; Representation Learning,Oral,3.2,104,G,,,\n4558,Combining 3D Morphable Models: A Largescale Face-and-Head Model,\"Stylianos Ploumpis (Imperial College London)*; Haoyang Wang (Imperial College 
London); Nick E. Pears (University of York, UK); William Smith (University of York); Stefanos Zafeiriou (Imperial College London)\",\"Face, Gesture, and Body Pose\",3D from Single Image; Vision + Graphics,Oral,3.2,105,G,,,\n4364,Boosting Local Shape Matching for Dense 3D Face Correspondence,Zhenfeng Fan (The Chinese academy of science)*; hu xiyuan (The Chinese academy of science); Chen Chen (The Chinese academy of science); peng silong (The Chinese academy of science),\"Face, Gesture, and Body Pose\",3D from Multiview and Sensors; 3D from Single Image; Physics-based Vision and Shape-from-X; RGBD sen,Oral,3.2,106,G,,,\n2886,Unsupervised Part-Based Disentangling of Object Shape and Appearance,Dominik Lorenz (Heidelberg University); Leonard Bereska (Heidelberg University); Timo Milbich (Heidelberg University)*; Bjorn Ommer (Heidelberg University),\"Face, Gesture, and Body Pose\",Image and Video Synthesis,Oral,3.2,107,G,,,\n2922,\"Monocular Total Capture: Posing Face, Body, and Hands in the Wild\",Donglai Xiang (Carnegie Mellon University)*; Hanbyul Joo (CMU); Yaser Sheikh (CMU),\"Face, Gesture, and Body Pose\",3D from Single Image; Vision + Graphics,Oral,3.2,108,G,,,\n3128,\"Expressive Body Capture: 3D Hands, Face, and Body from a Single Image\",Georgios Pavlakos (University of Pennsylvania)*; Michael J. 
Black (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Vasileios Choutas (Max Planck Institute for Intelligent Systems); Nima Ghorbani (Max Planck Institute Tübingen); Ahmed A A Osman (Max Planck Institute for Intelligent Systems); Dimitrios Tzionas (Max Planck Institute for Intelligent Systems),\"Face, Gesture, and Body Pose\",3D from Single Image,Oral,3.2,109,G,,,\n5458,Attribute-aware Face Aging with Wavelet-based Generative Adversarial Networks,\"Yunfan Liu (Institute of Automation, Chinese Academy of Sciences); Qi Li (CASIA); Zhenan Sun (Chinese of Academy of Sciences)*\",\"Face, Gesture, and Body Pose\",Biometrics,Poster,3.2,110,G,,,\n5518,Noise-Tolerant Paradigm for Training Face Recognition CNNs,Wei Hu ( Beijing University of Chemical Technology); Yangyu Huang (Yunshitu Corp.)*; Fan Zhang (Beijing University of Chemical Technology); Ruirui Li (Beijing University of Chemical Technology),\"Face, Gesture, and Body Pose\",\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,3.2,111,G,,,\n5529,Low-Rank Laplacian-Uniform Mixed Model for Robust Face Recognition,Jiayu Dong (Sun Yat-sen University); Huicheng Zheng (Sun Yat-sen University)*; Lina Lian (Sun Yat-sen University),\"Face, Gesture, and Body Pose\",Computer Vision Theory,Poster,3.2,112,G,,,\n6040,Generalizing Eye Tracking with Bayesian Adversarial Learning,Kang Wang (RPI)*; Rui Zhao (Rensselaer Polytechnic Institute); Hui Su (IBM); Qiang Ji (Rensselaer Polytechnic Institute),\"Face, Gesture, and Body Pose\",,Poster,3.2,113,G,,,\n6085,Local Relationship Learning with Person-specific Regularization for Facial Action Unit Detection,\"Xuesong Niu (Institute of Computing Technology, CAS); Hu Han (Chinese Academy of Sciences)*; Songfan Yang (100tal); Yan Huang (TAL Education Group); Shiguang Shan (Chinese Academy of Sciences)\",\"Face, Gesture, and Body Pose\",Biometrics,Poster,3.2,114,G,,,\n6096,Point-to-Pose Voting based Hand 
Pose Estimation using Residual Permutation Equivariant Layer,Shile Li (Technical University of Munich)*; Dongheui Lee (Technical University of Munich),\"Face, Gesture, and Body Pose\",\"3D from Single Image; Deep Learning ; Segmentation, Grouping and Shape\",Poster,3.2,115,G,,,\n6510,Improving User-Specific Gaze Estimation via Gaze Redirection Synthesis,\"Yu Yu (Idiap, and EPFL)*; Gang Liu (Idiap Research Institute); Jean-Marc ODOBEZ (IDIAP/EPFL, SWITZERLAND)\",\"Face, Gesture, and Body Pose\",Image and Video Synthesis; Vision Applications and Systems,Poster,3.2,116,G,,,\n6610,AdaptiveFace: Adaptive Margin and Sampling for Face Recognition,\"Hao Liu (NLPR, CASIA); Xiangyu Zhu (NLPR); Zhen Lei (NLPR, CASIA, China)*; Stan Li (National Lab. of Pattern Recognition, China)\",\"Face, Gesture, and Body Pose\",\"Biometrics ; Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,3.2,117,G,,,\n6817,Disentangled Representation Learning for 3D Face Shape,Zi-Hang Jiang (University of Science and Technology of China); Qianyi Wu (University of Science and Technology of China); Keyu Chen (University of Science and Technology of China); Juyong Zhang (University of Science and Technology of China)*,\"Face, Gesture, and Body Pose\",Biometrics,Poster,3.2,118,G,,,\n6834,Self-supervised Fitting of Articulated Meshes to Point Clouds,Chun-Liang Li (Carnegie Mellon University)*; Tomas Simon (Carnegie Mellon University); Jason Saragih (Oculus); Barnabas Poczos ( Carnegie Mellon University); Yaser Sheikh (Facebook Reality Labs),\"Face, Gesture, and Body Pose\",3D from Multiview and Sensors; Deep Learning,Poster,3.2,119,G,,,\n6964,PifPaf: Association Fields for Human Pose Estimation,Sven Kreiss (EPFL)*; Lorenzo Bertoni (EPFL); Alexandre Alahi (EPFL),\"Face, Gesture, and Body Pose\",Deep Learning,Poster,3.2,120,G,,,\n5442,TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection,Lin Song (Xi'an Jiaotong University)*; Shiwei Zhang (Huazhong University of 
Science and Technology); Gang Yu (Face++); Hongbin Sun (Xi'an Jiaotong University),Video Analytics,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.2,121,H,,,\n5576,Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos,Romero Morais (Deakin University)*; Vuong Le (Deakin University); Budhaditya Saha (Deakin University); Truyen Tran (Deakin University); Moussa Reda Mansour (iCetana); Svetha Venkatesh (Deakin University),Video Analytics,\"Deep Learning ; Face, Gesture, and Body Pose ; Motion and Tracking; Scene Analysis and Understanding\",Poster,3.2,122,H,,,\n5661,Local Temporal Bilinear Pooling for Fine-grained Action Parsing,\"Yan Zhang (Institute of Neural Information Processing, Ulm University)*; Siyu Tang (MPI for Intelligent Systems); Krikamol Muandet (Max Planck Institute for Intelligent Systems); Christian Jarvers (Ulm University); Heiko Neumann (Ulm University)\",Action Recognition,Statistical Learning,Poster,3.2,123,H,,,\n5677,Improving Action Localization by Progressive Cross-stream Cooperation,RUI SU (the University of Sydney); Wanli Ouyang (The University of Sydney); Luping Zhou (University of Sydney); Dong Xu (University of Sydney)*,Video Analytics,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.2,124,H,,,\n5734,Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition,\"Lei Shi (Institute of Automation, Chinese Academy of Sciences)*; Yifan o Zhang (Institute of Automation, Chinese Academy of Sciences); Jian Cheng (\"\"Chinese Academy of Sciences, China\"\"); Hanqing Lu (NLPR, Institute of Automation, CAS)\",Action Recognition,Deep Learning,Poster,3.2,125,H,,,\n5851,A neural network based on SPD manifold learning for skeleton-based hand gesture recognition,\"Xuan Son Nguyen (Ensicaen)*; Luc Brun (GREYC-ENSICAEN); Olivier Lézoray (University of Caen Normandy); Sébastien Bougleux (Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC)\",Action Recognition,\"Deep Learning ; Face, 
Gesture, and Body Pose\",Poster,3.2,126,H,,,\n5874,Large-scale weakly-supervised pre-training for video action recognition,Dhruv Mahajan (Facebook); Deepti Ghadiyaram (Facebook)*; Du Tran (Facebook Research),Action Recognition,\"Big Data, Large Scale Methods ; Datasets and Evaluation; Deep Learning ; Representation Learning\",Poster,3.2,127,H,,,\n6155,Learning Spatio-Temporal Representation with Local and Global Diffusion,Zhaofan Qiu (University of Science and Technology of China); Ting Yao (JD AI Research)*; Chong-Wah Ngo (City University of Hong Kong); Xinmei Tian (USTC); Tao Mei (AI Research of JD.com),Video Analytics,,Poster,3.2,128,H,,,\n6348,Unsupervised learning of action classes with continuous temporal embedding,Anna Kukleva (University of Bonn); Hilde Kuehne (University of Bonn)*; Fadime Sener (University of Bonn); Jürgen Gall (University of Bonn),Action Recognition,Video Analytics,Poster,3.2,129,H,,,\n7172,Double Nuclear Norm based Low Rank Representation on Grassmann Manifolds for Clustering,\"Xinglin Piao (Dalian University of Technology)*; Yongli Hu (Beijing University of Technology); Junbin Gao (University of Sydney, Australia); Yanfengn Sun (Beijing University of Technology); Baocai Yin Yin (Dalian University of Technology)\",Video Analytics,\"Representation Learning; Segmentation, Grouping and Shape\",Poster,3.2,130,H,,,\n5803,SR-LSTM: State Refinement for LSTM towards Pedestrian Trajectory Prediction,Pu Zhang (Xi’an Jiaotong University); Wanli Ouyang (The University of Sydney); Pengfei Zhang (Xi'an Jiaotong University); Jianru Xue (Xi'an Jiaotong University)*; Nanning Zheng (Xi'an Jiaotong University),Motion and Tracking,Deep Learning,Poster,3.2,131,I,,,\n6115,Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes,\"Yiran Zhong (Australian National University)*; Pan Ji (NEC Laboratories America); Yuchao Dai (Northwestern Polytechnical University); Jianyuan Wang (Australian National University); HONGDONG LI (Australian National 
University, Australia)\",Motion and Tracking,3D from Multiview and Sensors; Robotics + Driving,Poster,3.2,132,I,,,\n6223,An Efficient Schmidt-EKF for 3D Visual-Inertial SLAM,Patrick Geneva (University of Delaware)*; James Maley (University of Delaware); Guoquan Huang (University of Delaware),Motion and Tracking,3D from Multiview and Sensors; Optimization Methods; Robotics + Driving; Vision Applications and Sys,Poster,3.2,133,I,,,\n6640,A Neural Temporal Model for Human Motion Prediction,\"Anand Gopalakrishnan (Pennsylvania State University)*; Ankur Mali (Penn State); Dan Kifer (\"\"Pennsylvania State Univ., USA\"\"); Lee Giles (Pennsylvania State); Alexander Ororbia (Rochester Institute of Technology)\",Motion and Tracking,Deep Learning,Poster,3.2,134,I,,,\n6954,Convolutional Spatial Fusion for Multi-Agent Trajectory Prediction,\"Tianyang Zhao (Peking University)*; Yifei Xu (UCLA); Mathew Monfort (iSee); Wongun Choi (iSee); Chris Baker (iSee); Yibiao Zhao (iSee); Yizhou Wang (PKU); Ying Nian Wu (University of California, Los Angeles)\",Motion and Tracking,Deep Learning ; Robotics + Driving; Scene Analysis and Understanding; Vision Applications and System,Poster,3.2,135,I,,,\n5430,Coordinate-based Texture Inpainting for Pose-Guided Image Generation,Artur Grigorev (Samsung); Artem Sevastopolsky (Samsung); Alexander Vakhitov (Samsung AI Research Center); Victor Lempitsky (Samsung)*,Image and Video Synthesis,Deep Learning ; Vision + Graphics,Poster,3.2,136,J,,,\n5596,Stable Generative Adversarial Training via Data Distribution Filtering,Simon Jenni (Universität Bern)*; Paolo Favaro (University of Bern),Image and Video Synthesis,Deep Learning,Poster,3.2,137,J,,,\n5940,Self-Supervised Generative Adversarial Networks,Ting Chen (UCLA); Xiaohua Zhai (Google Brain); Marvin Ritter (Google Brain); Mario Lucic (Google Brain)*; Neil Houlsby (Google),Image and Video Synthesis,Deep Learning ; Representation Learning,Poster,3.2,138,J,,,\n5947,Texture Mixer: A Network for Controllable 
Synthesis and Interpolation of Texture,\"Ning Yu (Max Planck Institute for Informatics)*; Connelly Barnes (University of Virginia); Eli Shechtman (Adobe Research, US); Mike Lukac (Adobe Research); Sohrab Amirghodsi (Adobe Research)\",Image and Video Synthesis,Vision + Graphics,Poster,3.2,139,J,,,\n6172,Object-driven Text-to-Image Synthesis via Adversarial Training,\"Wenbo Li (\"\"University at Albany, SUNY\"\")*; Pengchuan Zhang (Microsoft Research AI); Lei Zhang (Microsoft Research); Qiuyuan Huang (Microsoft Research AI); Xiaodong He (JD AI Research); Siwei Lyu (University at Albany); Jianfeng Gao (Microsoft Research)\",Image and Video Synthesis,Vision + Language,Poster,3.2,140,J,,,\n6289,Zoom-In-to-Check: Boosting Video Interpolation via Instance-level Discrimination,Liangzhe Yuan (University of Pennsylvania)*; Yibo Chen (University of Pennsylvania); Hantian Liu (University of Pennsylvania); Tao Kong (Tsinghua); Jianbo Shi (University of Pennsylvania),Image and Video Synthesis,Deep Learning ; Motion and Tracking; Scene Analysis and Understanding,Poster,3.2,141,J,,,\n6308,Disentangling Latent Space for VAE by Label Relevant/Irrelevant Dimensions,Zhilin Zheng (East China Normal University); Li Sun (East China Normal University)*,Image and Video Synthesis,\"Deep Learning ; Face, Gesture, and Body Pose\",Poster,3.2,142,J,,,\n5055,Spectral Reconstruction from Dispersive Blur: Approaching Full Light Throughput Spectral Imager,Yuanyuan Zhao (Nanjing University); Hui Guo (Nanjing University); Xun Cao (Nanjing University); Zhan Ma (Nanjing University); Tao Yue (Nanjing University); Xuemei Hu (Nanjing University)*,Computational Photography,,Poster,3.2,143,K,,,\n5370,Quasi-Unsupervised Color Constancy,Simone Bianco (University of Milano Bicocca)*; Claudio Cusano (University of Pavia),Computational Photography,Deep Learning ; Low-level Vision,Poster,3.2,144,K,,,\n5571,Deep Defocus Map Estimation using Domain Adaptation,Junyong Lee (POSTECH); Sungkil Lee (Sungkyunkwan 
University); Sunghyun Cho (DGIST); Seungyong Lee (POSTECH)*,Computational Photography,Low-level Vision,Poster,3.2,145,K,,,\n6650,Using Unknown Occluders to Recover Hidden Scenes,Adam B Yedidia (Massachusetts Institute of Technology)*; Manel Baradad Jurjo (MIT); Christos Thrampoulidis (Massachusetts Institute of Technology); Bill Freeman (MIT); Gregory W Wornell (MIT),Computational Photography,Scene Analysis and Understanding; Vision Applications and Systems,Poster,3.2,146,K,,,\n707,Neural RGB -> D Sensing: Depth and Uncertainty from a Video Camera,Chao Liu (Carnegie Mellon University); Jinwei Gu (NVIDIA)*; Kihwan Kim (NVIDIA); Srinivasa G Narasimhan (Carnegie Mellon University); Jan Kautz (NVIDIA),Low-level Vision,3D from Multiview and Sensors; 3D from Single Image; Computational Photography,Oral,3.2,147,L,,,\n1006,DAVANet: Stereo Deblurring with View Aggregation,\"Shangchen Zhou (Sensetime Research)*; Jiawei Zhang (Sensetime Research); Jimmy Ren (SenseTime Research); Wangmeng Zuo (Harbin Institute of Technology, China); Haozhe Xie (Harbin Institute of Technology); Jinshan Pan (Nanjing University of Science and Technology)\",Low-level Vision,Deep Learning ; RGBD sensors and analytics,Oral,3.2,148,L,,,\n3657,DVC: An End-to-end Deep Video Compression Framework,Guo Lu (Shanghai Jiao Tong University)*; Wanli Ouyang (The University of Sydney); Dong Xu (University of Sydney); Chunlei Cai (Shanghai Jiao Tong University); Xiaoyun Zhang (Shanghai Jiao Tong University); Zhiyong Gao (Shanghai Jiao Tong University),Low-level Vision,Vision Applications and Systems,Oral,3.2,149,L,,,\n1098,SOSNet: Second Order Similarity Regularization for Local Descriptor Learning,\"yurun tian (National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences); Xin Yu (Australian National University); Bin Fan (Institute of Automation, Chinese Academy of Sciences, China)*; Fuchao Wu (National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy 
of Sciences); Huub Heijnen (Scape Technologies); Vassileios Balntas (Scape Technologies)\",Low-level Vision,\"Recognition: Detection, Categorization, Retrieval\",Oral,3.2,150,L,,,\n2154,“Double-DIP”: Unsupervised Image Decomposition via Coupled Deep-Image-Priors,\"Yosef Gandelsman (Weizmann Institute of Science)*; Assaf Shocher (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel)\",Low-level Vision,Computational Photography; Deep Learning,Oral,3.2,151,L,,,\n2579,Unprocessing Images for Learned Raw Denoising,Tim Brooks (Google)*; Ben Mildenhall (UC Berkeley); Tianfan Xue (MIT); Jiawen Chen (Google); Dillon Sharlet (Google); Jonathan T Barron (Google Research),Low-level Vision,Computational Photography; Deep Learning,Oral,3.2,152,L,,,\n3342,Residual Networks for Light Field Image Super-Resolution,Shuo Zhang (Beijing Jiaotong University)*; Youfang Lin (Beijing Jiaotong University); Hao Sheng (Beihang University),Low-level Vision,3D from Multiview and Sensors; Computational Photography; Deep Learning,Oral,3.2,153,L,,,\n3959,Modulating Image Restoration with Continual Levels via Adaptive Feature Modification Layers,\"Jingwen He (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Chao Dong (SIAT)*; Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)\",Low-level Vision,Deep Learning,Oral,3.2,154,L,,,\n5318,Second-order Attention Network for Single Image Super-resolution,\"Tao Dai (Tsinghua University)*; Jianrui Cai (The Hong Kong Polytechnic University, Hong Kong, China); yongbing zhang (Tsinghua University); Shutao Xia (Tsinghua University); Lei Zhang (\"\"Hong Kong Polytechnic University, Hong Kong, China\"\")\",Low-level Vision,Deep Learning,Oral,3.2,155,L,,,\n2599,Devil is in the Edges: Learning Semantic Boundaries from Noisy Annotations,David Acuna (University of Toronto)*; Amlan Kar (University of Toronto); Sanja Fidler (University of Toronto),\"Segmentation, Grouping and Shape\",Low-level 
Vision,Oral,3.2,156,L,,,\n3097,Path-Invariant Map Networks,\"Zaiwei Zhang (University of Texas at Austin); Zhenxiao Liang (The University of Texas at Austin); Lemeng Wu (The University of Texas at Austin); Xiaowei Zhou (Zhejiang Univ., China); Qixing Huang (The University of Texas at Austin)*\",Optimization Methods,\"Big Data, Large Scale Methods ; Deep Learning\",Oral,3.2,157,L,,,\n5608,FilterReg: Robust and Efficient Probabilistic Point-Set Registration using Gaussian Filter and Twist Parameterization,Wei Gao (MIT)*; Russ Tedrake (MIT),Low-level Vision,Motion and Tracking; Optimization Methods; RGBD sensors and analytics; Robotics + Driving,Oral,3.2,158,L,,,\n108,Probabilistic Permutation Synchronization using the Riemannian Structure of the Birkhoff Polytope,Tolga Birdal (TU Munich)*; Umut Simsekli (Telecom ParisTech),Optimization Methods,Computer Vision Theory; Low-level Vision; Statistical Learning,Oral,3.2,159,L,,,\n190,Lifting Vectorial Variational Problems: A Natural Formulation based on Geometric Measure Theory and Discrete Exterior Calculus,Thomas Möllenhoff (Technical University of Munich)*; Daniel Cremers (TUM),Optimization Methods,,Oral,3.2,160,L,,,\n1428,A Sufficient Condition for Convergences of Adam and RMSProp,Fangyu Zou (stonybrook); Li Shen (Tencent AI Lab)*; Zequn Jie (Tencent AI Lab); Weizhong Zhang (Tencent AI Lab); Wei Liu (Tencent),Optimization Methods,\"Big Data, Large Scale Methods\",Oral,3.2,161,L,,,\n5959,Guaranteed Matrix Completion under Multiple Linear Transformations,Chao Li (RIKEN)*; Wei He (RIKEN AIP); Longhao Yuan (Saitama Institute of Technology/RIKEN AIP); Zhun Sun (RIKEN Center for AIP); Qibin Zhao (RIKEN),Low-level Vision,Computational Photography; Statistical Learning,Oral,3.2,162,L,,,\n4802,MAP inference via Block-Coordinate Frank-Wolfe Algorithm,\"Paul Swoboda (MPI fuer Informatik, Saarbruecken)*; Vladimir Kolmogorov (Institute of Science and Technology, Austria)\",Optimization Methods,,Oral,3.2,163,L,,,\n5321,A convex 
relaxation for multi-graph matching,\"Paul Swoboda (MPI fuer Informatik, Saarbruecken)*; Ashkan Mokarian (BIH/MDC); Dagmar Kainmueller (BIH/MDC); Christian Theobalt (MPI Informatik); Florian Bernard (Max Planck Institute for Informatics)\",Optimization Methods,\"Segmentation, Grouping and Shape\",Oral,3.2,164,L,,,\n92,\"Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation\",Anurag Ranjan (MPI for Intelligent Systems)*; Varun Jampani (Nvidia Research); Kihwan Kim (NVIDIA); Deqing Sun (NVIDIA); Lukas Balles (University of Tuebingen); Jonas Wulff (Massachusetts Institute of Technology); Michael J. Black (Max Planck Institute for Intelligent Systems),Low-level Vision,Deep Learning,Poster,3.2,165,L,,,\n5558,Learning Parallax Attention for Stereo Image Super-Resolution,Longguang Wang (National University of Defense Technology); Yingqian Wang (National University of Defense Technology ); Zhengfa Liang (Southwest Electronics and Telecommunication Technology Research Institute); Zaiping Lin (National University of Defense Technology); Jungang Yang (National University of Defense Technology); Wei An (National University of Defense Technology); Yulan Guo (National University of Defense Technology)*,Low-level Vision,Computational Photography; Deep Learning,Poster,3.2,166,L,,,\n5817,Knowing When to Stop: Evaluation and Verification of Conformity to Output-size Specs,Chenglong Wang (University of Washington)*; Rudy R Bunel (University of Oxford); Krishnamurthy Dvijotham (); Po-Sen Huang (DeepMind); Edward Grefenstette (Facebook AI Research); Pushmeet Kohli (DeepMind),Optimization Methods,Deep Learning ; Vision + Language,Poster,3.2,167,L,,,\n5924,Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset,Tianyu Wang (Dalian University of Technology;City University of Hong Kong); Xin Yang (Dalian University of Technology); Ke Xu (Dalian University of Technology;City University of Hong Kong); 
Shaozhe Chen (Dalian University of Technology); qiang zhang (Key Laboratory of Advanced Design and Intelligent Computing Ministry of Education. Dalian University.); Rynson W.H. Lau (City University of Hong Kong)*,Low-level Vision,,Poster,3.2,168,L,,,\n6049,Focus Loss Functions for Event-based Vision,\"Guillermo Gallego (University of Zurich and ETH Zurich)*; Mathias Gehrig (University of Zurich); Davide Scaramuzza (University of Zurich & ETH Zurich, Switzerland)\",Low-level Vision,Motion and Tracking,Poster,3.2,169,L,,,\n6090,Scalable Convolutional Neural Network for Image Compressed Sensing,\"Wuzhen Shi (Harbin Institute of Technology)*; Feng Jiang (Harbin Institute of Technology, Harbin); Shaohui Liu (Harbin Institute of Technology); Debin Zhao (Harbin Institute of Technology)\",Low-level Vision,\"Big Data, Large Scale Methods ; Deep Learning ; Others; Representation Learning; Vision Applications\",Poster,3.2,170,L,,,\n6183,\"Event Cameras, Contrast Maximization and Reward Functions: an Analysis\",Timo N Stoffregen (Monash University)*; Lindsay Kleeman (Monash University),Low-level Vision,Computer Vision Theory; Motion and Tracking,Poster,3.2,171,L,,,\n6251,Convolutional Neural Networks Deceived by Visual Illusions,Alexander Gomez Villa (Universitat Pompeu Fabra)*; Adrian Martin (Universitat Pompeu Fabra); Javier Vazquez-Corral (University of East Anglia); Marcelo Bertalmío (Universitat Pompeu Fabra),Low-level Vision,Computer Vision Theory; Deep Learning ; Representation Learning,Poster,3.2,172,L,,,\n6423,PDE Acceleration for Active Contours,Minas D Benyamin (Georgia Institute of Technology )*; Ganesh Sundaramoorthi (Kaust); Anthony Yezzi (Georgia Tech),Optimization Methods,\"Computer Vision Theory; Segmentation, Grouping and Shape\",Poster,3.2,173,L,,,\n6493,Dichromatic Model Based Temporal Color Constancy for AC Light Sources,Jun-Sang Yoo (Korea University)*; Jong-Ok Kim (Korea University),Low-level Vision,Computer Vision 
Theory,Poster,3.2,174,L,,,\n6564,Semantic Attribute Matching Networks,Seungryong Kim (Yonsei University)*; Dongbo Min (Ewha Womans University); Somi Jeong (Yonsei University); Sunok Kim (Yonsei University); Sangryul Jeon (Yonsei university); Kwanghoon Sohn (Yonsei Univ.),Low-level Vision,Vision + Graphics,Poster,3.2,175,L,,,\n6653,Skin-based identification from multispectral image data using CNNs,Takeshi Uemori (Sony Europe Ltd.); Atsushi Ito (Sony Corporation)*; Yusuke Moriuchi (Sony Corporation); Alexander Gatto (Sony Europe Ltd.); Jun Murayama (Sony Corporation),Low-level Vision,Computational Photography; Vision Applications and Systems,Poster,3.2,176,L,,,\n7061,Large-scale Distributed Second-order Optimization Using Kronecker-factored Approximate Curvature for Deep Convolutional Neural Networks,Kazuki Osawa (Tokyo Institute of Technology)*; Yohei Tsuji (Tokyo Institute of Technology); Yuichiro Ueno (Tokyo Institute of Technology); Akira Naruse (NVIDIA); Rio Yokota (Tokyo Institute of Technology); Satoshi Matsuoka (RIKEN),Optimization Methods,\"Big Data, Large Scale Methods ; Deep Learning\",Poster,3.2,177,L,,,\n100,Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments,\"Xueting Li (University of California, Merced)*; Sifei Liu (NVIDIA); Kihwan Kim (NVIDIA); Ming-Hsuan Yang (University of California at Merced); Xiaolong Wang (CMU); Jan Kautz (NVIDIA)\",Scene Analysis and Understanding,Deep Learning,Poster,3.2,178,M,,,\n5042,PIEs: Pose Invariant Embeddings,\"Chih-Hui Ho (University of California San Diego)*; Pedro Morgado (University of California, San Diego); Amir Persekian (University of California, San Diego); Nuno Vasconcelos (UC San Diego)\",Representation Learning,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval\",Poster,3.2,179,M,,,\n5878,Representation Similarity Analysis for Efficient Task taxonomy & Transfer Learning,Kshitij Dwivedi (Singapore University of Technology and Design)*; Gemma Roig (MIT),Representation 
Learning,\"Deep Learning ; Low-level Vision; Recognition: Detection, Categorization, Retrieval; Scene Analysis\",Poster,3.2,180,M,,,\n6020,Object Counting and Instance Segmentation with Image-level Supervision,Hisham Cholakkal (Inception Institute of Artificial Intelligence)*; Guolei Sun (Inception Institute of Artificial Intelligence); Fahad Shahbaz Khan (Inception Institute of Artificial Intelligence); Ling Shao (Inception Institute of Artificial Intelligence),Scene Analysis and Understanding,\"Segmentation, Grouping and Shape\",Poster,3.2,181,M,,,\n6226,Variational Autoencoders Recover PCA Directions (by Accident),Michal Rolinek (Max Planck Institute for Intelligent Systems)*; Dominik Zietlow (Max Planck Institute for Intelligent Systems); Georg Martius (Max Planck Institute for Intelligent Systems),Representation Learning,Computer Vision Theory; Deep Learning,Poster,3.2,182,M,,,\n6246,A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes,Lichao Mou (DLR&TUM)*; Yuansheng Hua ( German Aerospace Center); Xiaoxiang Zhu (Technical University of Munich (TUM); German Aerospace Center (DLR)),Scene Analysis and Understanding,Deep Learning,Poster,3.2,183,M,,,\n6250,Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping,Suhas Lohit (Arizona State University)*; Qiao Wang (Arizona State University); Pavan Turaga (Arizona State University),Representation Learning,Action Recognition ; RGBD sensors and analytics; Statistical Learning,Poster,3.2,184,M,,,\n6500,PCAN:Learning Attention Map Using Contextual Information for Point Cloud Based Retrieval,Chunxia Xiao (Wuhan University)*; Wenxiao Zhang (Wuhan University),Representation Learning,\"3D from Multiview and Sensors; Recognition: Detection, Categorization, Retrieval\",Poster,3.2,185,M,,,\n6770,Depth Coefficients for Depth Completion,Saif M Imran (Michigan State University)*; Yunfei Long (Michigan State University); Xiaoming Liu (Michigan State 
University); Daniel Morris (MSU),Scene Analysis and Understanding,Deep Learning ; RGBD sensors and analytics,Poster,3.2,186,M,,,\n6933,Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection,Taekyung Kim (KAIST); Minki Jeong (KAIST); Seunghyeon Kim (KAIST); Seokeon Choi (KAIST); Changick Kim (KAIST)*,Representation Learning,\"Recognition: Detection, Categorization, Retrieval\",Poster,3.2,187,M,,,\n5508,\"Good News, Everyone! Context driven entity-aware captioning for news images\",\"Ali Furkan Biten (Computer Vision Center)*; Lluis Gomez (Universitat Autónoma de Barcelona); Marçal Rusiñol (Computer Vision Center, UAB); Dimosthenis Karatzas (Computer Vision Centre)\",Vision + Language,Document Analysis,Poster,3.2,188,N,,,\n5592,Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding,Hassan Akbari (Columbia University)*; Svebor Karaman (Columbia University); Surabhi Bhargava (Columbia University); Brian Chen (Columbia University); Carl Vondrick (Columbia University); Shih-Fu Chang (Columbia University),Vision + Language,Deep Learning ; Representation Learning; Visual Reasoning,Poster,3.2,189,N,,,\n5609,Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning,Nayyer Aafaq (The University of Western Australia)*; Naveed Akhtar (The University of Western Australia); Wei Liu (University of Western Australia); Syed Zulqarnain Gilani (The University of Western Australia); Ajmal Mian (University of Western Australia),Vision + Language,Representation Learning; Vision Applications and Systems,Poster,3.2,190,N,,,\n5669,Pointing Novel Objects in Image Captioning,Yehao Li (Sun Yat-Sen University); Ting Yao (JD AI Research); Yingwei Pan (JD AI Research)*; Hongyang Chao (Sun Yat-sen University); Tao Mei (AI Research of JD.com),Vision + Language,,Poster,3.2,191,N,,,\n5957,Informative Object Annotations: Tell Me Something I Don’t Know,Lior Bracha (Bar Ilan University)*; Gal CHECHIK (Bar Ilan 
University),Vision + Language,\"Recognition: Detection, Categorization, Retrieval; Scene Analysis and Understanding; Statistical Lea\",Poster,3.2,192,N,,,\n5995,Engaging Image Captioning via Personality,Kurt Shuster (Facebook)*; Samuel Humeau (Facebook); Hexiang Hu (USC); Antoine Bordes (Facebook); Jason Weston (FAIR),Vision + Language,Datasets and Evaluation,Poster,3.2,193,N,,,\n6028,Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention,Khanh X Nguyen (University of Maryland)*; Debadeepta Dey (Microsoft); Chris Brockett (Microsoft); Bill Dolan (Microsoft),Vision + Language,Representation Learning; Robotics + Driving; Visual Reasoning,Poster,3.2,194,N,,,\n6459,Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments,Howard Chen (Cornell)*; Alane Suhr (Cornell University); Dipendra Misra (Cornell University); Noah Snavely (Cornell University and Google AI); Yoav Artzi (Cornell University),Vision + Language,Datasets and Evaluation; Visual Reasoning,Poster,3.2,195,N,,,\n6902,A Simple Baseline for Audio-Visual Scene-Aware Dialog,Idan Schwartz (Technion); Alexander Schwing (UIUC)*; Tamir Hazan (Technion),Vision + Language,,Poster,3.2,196,N,,,\n5379,End-to-End Learned Random Walker for Seeded Image Segmentation,Lorenzo Cerrone (HCI/IWR uni heidelberg)*; Alexander Zeilmann (IWR uni heidelberg); Fred Hamprecht (Heidelberg Collaboratory for Image Processing),\"Medical, Biological and Cell Microscopy\",\"Computer Vision Theory; Segmentation, Grouping and Shape\",Poster,3.2,197,O,,,\n5380,Efficient Neural Network Compression,Hyeji Kim (Korea Advanced Institute of Science and Technology)*; Muhammad Umar Karim Khan (Korea Advanced Institute of Science and Technology); Chong-Min Kyung (Korea Advanced Institute of Science and Technology),Vision Applications and Systems,Deep Learning ; Optimization Methods,Poster,3.2,198,O,,,\n5462,Cascaded Generative and Discriminative Learning for 
Microcalcification Detection in Breast Mammograms,Fandong Zhang (Peking University)*; Ling Luo (Beijing University of Posts and Telecommunications); Xinwei Sun (Peking University); Zhen Zhou (Peking University); Xiuli Li (Deepwise Inc.); Yizhou Wang (PKU); Yizhou Yu (Deepwise AI Lab),\"Medical, Biological and Cell Microscopy\",Deep Learning ; Statistical Learning,Poster,3.2,199,O,,,\n5488,C3AE: Exploring the Limits of Compact Model for Age Estimation,Chao Zhang (UESTC; Megvii)*; Shuaicheng Liu (UESTC; Megvii); Xun Xu (NUS); Ce Zhu (University of Electronic Science & Technology of China),Vision Applications and Systems,\"Deep Learning ; Face, Gesture, and Body Pose ; Recognition: Detection, Categorization, Retrieval\",Poster,3.2,200,O,,,\n5549,Adaptive Weighting Multi-Field-of-View CNN for Semantic Segmentation in Pathology,Hiroki Tokunaga (Kyushu University)*; Yuki Teramoto (Kyoto University); Akihiko Yoshizawa (Kyoto University); Ryoma Bise (Kyushu University),\"Medical, Biological and Cell Microscopy\",\"Deep Learning ; Segmentation, Grouping and Shape\",Poster,3.2,201,O,,,\n5792,In Defense of Pre-Trained ImageNet Architectures for Real-Time Semantic Segmentation of Road-driving Images,Marin Oršić (UNIZG-FER)*; Ivan Krešo (UNIZG-FER); Petra Bevandić (Faculty of Electrical Engineering and Computing); Sinisa Segvic (UniZg-FER),Vision Applications and Systems,\"Deep Learning ; Recognition: Detection, Categorization, Retrieval; Robotics + Driving; Scene Analysi\",Poster,3.2,202,O,,,\n5796,Context-Aware Visual Compatibility Prediction,Guillem Cucurull (Element AI)*; Perouz Taslakian (Element AI); David Vazquez (Element AI),Vision Applications and Systems,Deep Learning ; Representation Learning,Poster,3.2,203,O,,,\n5847,Sim-to-Real via Sim-to-Sim: Data-efficient Robotic Grasping via Randomized-to-Canonical Adaptation Networks,Stephen L James (Imperial College London)*; Paul Wohlhart (Google X); Raia Hadsell (Google Deepmind); Julian Ibarz (Google); Alex Irpan (Google); Dmitry 
Kalashnikov (Google Inc.); Sergey Levine (Google); Mrinal Kalakrishnan (X); Konstantinos Bousmalis (DeepMind),Robotics + Driving,Deep Learning,Poster,3.2,204,O,,,\n5882,Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation (POINT^2),\"Haofu Liao (University of Rochester)*; Wei-An Lin (The University of Maryland, College Park); Jiarui Zhang (Rutgers University); Jingdan Zhang (Z2AI Corporation); Jiebo Luo (U. Rochester); S. Kevin Zhou (ICT)\",\"Medical, Biological and Cell Microscopy\",Motion and Tracking,Poster,3.2,205,O,,,\n5904,Context-aware Spatio-recurrent Curvilinear Structure Segmentation,Feigege Wang (Fuzhou University); Yue Gu (Fuzhou University); Wenxi Liu (Fuzhou University)*; Yuanlong Yu (Fuzhou University); Shengfeng He (South China University of Technology); Jia Pan (City University of Hong Kong),\"Medical, Biological and Cell Microscopy\",\"Segmentation, Grouping and Shape\",Poster,3.2,206,O,,,\n5911,An Alternative Deep Feature Approach to Line Level Keyword Spotting,George Retsinas (NCSR Demokritos)*; Georgios Louloudis (NCSR Demokritos); Nikolaos Stamatopoulos (NCSR Demokritos); Giorgos Sfikas (NCSR Demokritos); Basilis Gatos (NCSR Demokritos),Document Analysis,Deep Learning ; Vision Applications and Systems,Poster,3.2,207,O,,,\n5993,Dynamics are Important for the Recognition of Equine Pain in Video,Sofia Broomé (KTH Royal Institute of Technology)*; Karina Bech Gleerup (University of Copenhagen); Pia Haubro Andersen (Swedish University of Agricultural Sciences); Hedvig Kjellström (KTH Royal Institute of Technology),Vision Applications and Systems,\"Action Recognition ; Deep Learning ; Face, Gesture, and Body Pose\",Poster,3.2,208,O,,,\n6119,LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving,Ankit Laddha (Uber)*; Greg Meyer (Uber); Carl Wellington (Uber); Carlos Vallespi-Gonzalez (Uber); Eric Kee (Uber ATG),Robotics + Driving,\"Recognition: Detection, Categorization, Retrieval; 
RGBD sensors and analytics; Scene Analysis and Un\",Poster,3.2,209,O,,,\n6366,Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds,Zihao Liu (Florida International University)*; Xiaowei Xu (University of Notre Dame); Tao Liu (Florida International University); Qi Liu (Florida International University ); Yanzhi Wang (Northeastern University); Yiyu Shi (University of Notre Dame); Wujie Wen (Florida International University); Meiping Huang (Guangdong General Hospital); Haiyun Yuan (Guangdong General Hospital); Jian Zhuang (Guangdong General Hospital),\"Medical, Biological and Cell Microscopy\",\"Segmentation, Grouping and Shape\",Poster,3.2,210,O,,,\n6374,PointPillars: Fast Encoders for 3D Object Detection from Point Clouds,Alex H Lang (nuTonomy: an APTIV company)*; Sourabh Vora (nuTonomy: an APTIV company); Holger Caesar (nuTonomy: an APTIV company); Lubing Zhou (nuTonomy: an APTIV company); Jiong Yang (nuTonomy: an APTIV company); Oscar Beijbom (nuTonomy: an APTIV company),Robotics + Driving,3D from Multiview and Sensors; Deep Learning ; Scene Analysis and Understanding; Vision Applications,Poster,3.2,211,O,,,\n6388,Motion estimation of non-holonomic ground vehicles from a single feature correspondence measured over n views,Kun Huang (ShanghaiTech University)*; Yifu Wang (Australian National University); Laurent Kneip (ShanghaiTech University),Robotics + Driving,3D from Multiview and Sensors; Motion and Tracking,Poster,3.2,212,O,,,\n6575,From Coarse to Fine: Robust Hierarchical Localization at Large Scale,Paul-Edouard Sarlin (ETH Zürich Autonomous Systems Lab); Cesar Cadena (ETH Zurich); Roland Siegwart (ETH Zürich Autonomous Systems Lab); Marcin T Dymczyk (Sevensense Robotics AG)*,Robotics + Driving,\"Datasets and Evaluation; Deep Learning ; Recognition: Detection, Categorization, Retrieval; Vision A\",Poster,3.2,213,O,,,\n6731,Large Scale High-Resolution Land Cover Mapping with Multi-Resolution 
Data,Caleb Robinson (Georgia Institute of Technology)*; Nebojsa Jojic (Microsoft Research); Le Hou (Stony Brook University); Kolya Malkin (Yale University); Bistra Dilkina (University of Southern California); Rachel Soobitsky (Chesapeake Conservancy); Jacob Czawlytko (Chesapeake Conservancy),Vision Applications and Systems,\"Segmentation, Grouping and Shape\",Poster,3.2,214,O,,,\n6803,Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting,\"Muming Zhao (University of Technology, Sydney)*; Jian Zhang (UTS); Chongyang Zhang (Shanghai Jiao Tong University); Wenjun Zhang (Shanghai Jiao Tong University)\",Vision Applications and Systems,Deep Learning ; Representation Learning; Scene Analysis and Understanding,Poster,3.2,215,O,,," } ]