Repository: witnessai/Awesome-Open-Vocabulary-Object-Detection
Branch: main
Commit: 1e476cd2d726
Files: 1
Total size: 21.3 KB

Directory structure:
gitextract_brax8ug3/
└── README.md

================================================
FILE CONTENTS
================================================

================================================
FILE: README.md
================================================
# Awesome-Open-Vocabulary-Object-Detection

# Contact
```
scottn@foxmail.com
```

# <span id='Papers'>Papers</span>
## 2025
+ **CCKT-Det**: Chuhan Zhang, Chaoyang Zhu, Pingcheng Dong, Long Chen, Dong Zhang. **Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection.** ICLR 2025. [[paper]](https://arxiv.org/abs/2503.11005)
+ **Sambor**: Xumeng Han, Longhui Wei, Xuehui Yu, Zhiyang Dou, Xin He, Kuiran Wang, Yingfei Sun, Zhenjun Han, Qi Tian. **Boosting Segment Anything Model Towards Open-Vocabulary Learning.** AAAI 2025. [[paper]](https://arxiv.org/abs/2312.03628)
+ **OV-DQUO**: Junjie Wang, Bin Chen, Bin Kang, Yulin Li, YiChi Chen, Weizhi Xian, Huifeng Chang, Yong Xu. **OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision.** AAAI 2025. [[paper]](https://arxiv.org/abs/2405.17913) [[code]](https://github.com/xiaomoguhz/OV-DQUO)
+ Rohit Bharadwaj, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan. **Enhancing Novel Object Detection via Cooperative Foundational Models.** WACV 2025. [[paper]](https://arxiv.org/abs/2311.12068) [[code]](https://github.com/rohit901/cooperative-foundational-models)
+ **HD-OVD**: Shenghao Fu, Junkai Yan, Qize Yang, Xihan Wei, Xiaohua Xie, Wei-Shi Zheng. **A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection.** TMM 2025. [[paper]](https://arxiv.org/abs/2503.10152)

## 2024
+ Hojun Choi, Junsuk Choe, Hyunjung Shim. **Sampling Bag of Views for Open-Vocabulary Object Detection.** arXiv 2024.
[[paper]](https://arxiv.org/abs/2412.18273)
+ **OV-DINO**: Hao Wang, Pengzhen Ren, Zequn Jie, Xiao Dong, Chengjian Feng, Yinlong Qian, Lin Ma, Dongmei Jiang, Yaowei Wang, Xiangyuan Lan, Xiaodan Liang. **OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion.** arXiv 2024. [[paper]](https://arxiv.org/abs/2407.07844) [[code]](https://github.com/wanghao9610/OV-DINO)
+ **NRAA**: Sunyuan Qiang, Xianfei Li, Yanyan Liang, Wenlong Liao, Tao He, Pai Peng. **Open-Vocabulary Object Detection via Neighboring Region Attention Alignment.** arXiv 2024. [[paper]](https://arxiv.org/abs/2405.08593)
+ **AggDet**: Yanhao Zheng, Kai Liu. **Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation.** arXiv 2024. [[paper]](https://arxiv.org/abs/2404.08603)
+ **OmDet-Turbo**: Tiancheng Zhao, Peng Liu, Xuan He, Lu Zhang, Kyusong Lee. **Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head.** arXiv 2024. [[paper]](https://arxiv.org/abs/2403.06892) [[code]](https://github.com/om-ai-lab/OmDet)
+ **MIPT**: Guilin Li, Mengdan Zhang, Xiawu Zheng, Peixian Chen, Zihan Wang, Yunhang Shen, Mingchen Zhuge, Chenglin Wu, Fei Chao, Ke Li, Xing Sun, Rongrong Ji. **Multimodal Inplace Prompt Tuning for Open-set Object Detection.** ACM MM 2024. [[paper]](https://openreview.net/forum?id=J6c0sRkWop)
+ **CLIFF**: Wuyang Li, Xinyu Liu, Jiayi Ma, Yixuan Yuan. **CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection.** ECCV 2024. [[paper]](https://www.ecva.net/papers/eccv_2024/papers_ECCV/html/7221_ECCV_2024_paper.php)
+ **OpenSight**: Hu Zhang, Jianhua Xu, Tao Tang, Haiyang Sun, Xin Yu, Zi Huang, Kaicheng Yu. **OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection.** ECCV 2024. [[paper]](https://arxiv.org/abs/2312.08876)
+ **CastDet**: Yan Li, Weiwei Guo, Xue Yang, Ning Liao, Dunyun He, Jiaqi Zhou, Wenxian Yu.
**Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning.** ECCV 2024. [[paper]](https://arxiv.org/abs/2311.11646) [[code]](https://github.com/VisionXLab/CastDet)
+ **MarvelOVD**: Kuo Wang, Lechao Cheng, Weikai Chen, Pingping Zhang, Liang Lin, Fan Zhou, Guanbin Li. **MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection.** ECCV 2024. [[paper]](https://arxiv.org/abs/2407.21465) [[code]](https://github.com/wkfdb/MarvelOVD)
+ **LaMI-DETR**: Penghui Du, Yu Wang, Yifan Sun, Luting Wang, Yue Liao, Gang Zhang, Errui Ding, Yan Wang, Jingdong Wang, Si Liu. **LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction.** ECCV 2024. [[paper]](https://arxiv.org/abs/2407.11335)
+ **T-Rex2**: Qing Jiang, Feng Li, Zhaoyang Zeng, Tianhe Ren, Shilong Liu, Lei Zhang. **T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy.** ECCV 2024. [[paper]](https://arxiv.org/abs/2403.14610)
+ **YOLO-World**: Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan. **YOLO-World: Real-Time Open-Vocabulary Object Detection.** CVPR 2024. [[paper]](https://arxiv.org/abs/2401.17270) [[code]](https://github.com/AILab-CVC/YOLO-World)
+ **SAS-Det**: Shiyu Zhao, Samuel Schulter, Long Zhao, Zhixing Zhang, Vijay Kumar B.G, Yumin Suh, Manmohan Chandraker, Dimitris N. Metaxas. **Taming Self-Training for Open-Vocabulary Object Detection.** CVPR 2024. [[paper]](https://arxiv.org/abs/2308.06412) [[code]](https://github.com/xiaofeng94/SAS-Det)
+ **LBP**: Jiaming Li, Jiacheng Zhang, Jichang Li, Ge Li, Si Liu, Liang Lin, Guanbin Li. **Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection.** CVPR 2024. [[paper]](https://arxiv.org/abs/2406.00510)
+ **SHiNe**: Mingxuan Liu, Tyler L. Hayes, Elisa Ricci, Gabriela Csurka, Riccardo Volpi. **SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection.** CVPR 2024.
[[paper]](https://arxiv.org/abs/2405.10053)
+ **BIND**: Heng Zhang, Qiuyu Zhao, Linyu Zheng, Hao Zeng, Zhiwei Ge, Tianhao Li, Sulong Xu. **Exploring Region-Word Alignment in Built-in Detector for Open-Vocabulary Object Detection.** CVPR 2024. [[paper]](https://openaccess.thecvf.com/content/CVPR2024/html/Zhang_Exploring_Region-Word_Alignment_in_Built-in_Detector_for_Open-Vocabulary_Object_Detection_CVPR_2024_paper.html)
+ **DetCLIPv3**: Lewei Yao, Renjie Pi, Jianhua Han, Xiaodan Liang, Hang Xu, Wei Zhang, Zhenguo Li, Dan Xu. **DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection.** CVPR 2024. [[paper]](https://arxiv.org/abs/2404.09216)
+ **HyperLearner**: Fanjie Kong, Yanbei Chen, Jiarui Cai, Davide Modolo. **Hyperbolic Learning with Synthetic Captions for Open-World Detection.** CVPR 2024. [[paper]](https://arxiv.org/abs/2404.05016)
+ **RALF**: Jooyeon Kim, Eulrang Cho, Sehyung Kim, Hyunwoo J. Kim. **Retrieval-Augmented Open-Vocabulary Object Detection.** CVPR 2024. [[paper]](https://arxiv.org/abs/2404.05687) [[code]](https://github.com/mlvlab/RALF)
+ **InstaGen**: Chengjian Feng, Yujie Zhong, Zequn Jie, Weidi Xie, Lin Ma. **InstaGen: Enhancing Object Detection by Training on Synthetic Dataset.** CVPR 2024. [[paper]](https://arxiv.org/abs/2402.05937) [[code]](https://fcjian.github.io/InstaGen/)
+ Lorenzo Bianchi, Fabio Carrara, Nicola Messina, Claudio Gennaro, Fabrizio Falchi. **The Devil is in the Fine-Grained Details: Evaluating Open-Vocabulary Object Detectors for Fine-Grained Understanding.** CVPR 2024. [[paper]](https://arxiv.org/abs/2311.17518) [[code]](https://github.com/lorebianchi98/FG-OVD)
+ Jang Hyun Cho, Philipp Krähenbühl. **Language-conditioned Detection Transformer.** CVPR 2024. [[paper]](https://arxiv.org/abs/2311.17902)
+ **WSOVOD**: Jianghang Lin, Yunhang Shen, Bingquan Wang, Shaohui Lin, Ke Li, Liujuan Cao. **Weakly Supervised Open-Vocabulary Object Detection.** AAAI 2024.
[[paper]](https://arxiv.org/abs/2312.12437)
+ **CLIM**: Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Wentao Liu, Chen Change Loy. **CLIM: Contrastive Language-Image Mosaic for Region Representation.** AAAI 2024. [[paper]](https://arxiv.org/abs/2312.11376)
+ **SIC-CADS**: Ruohuan Fang, Guansong Pang, Xiao Bai. **Simple Image-level Classification Improves Open-vocabulary Object Detection.** AAAI 2024. [[paper]](https://arxiv.org/abs/2312.10439) [[code]](https://github.com/mala-lab/SIC-CADS)
+ **ProxyDet**: Joonhyun Jeong, Geondo Park, Jayeon Yoo, Hyungsik Jung, Heesu Kim. **ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection.** AAAI 2024. [[paper]](https://arxiv.org/abs/2312.07266) [[code]](https://github.com/clovaai/ProxyDet)
+ **OVDEval**: Yiyang Yao, Peng Liu, Tiancheng Zhao, Qianqian Zhang, Jiajia Liao, Chunxin Fang, Kyusong Lee, Qing Wang. **How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection.** AAAI 2024. [[paper]](https://arxiv.org/abs/2308.13177) [[code]](https://github.com/om-ai-lab/OVDEval)
+ Chau Pham, Truong Vu, Khoi Nguyen. **LP-OVOD: Open-Vocabulary Object Detection by Linear Probing.** WACV 2024. [[paper]](https://arxiv.org/abs/2310.17109)
+ **MMC-Det**: Yifan Xu, Mengdan Zhang, Xiaoshan Yang, Changsheng Xu. **Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection.** TIP 2024. [[paper]](https://arxiv.org/abs/2308.15846)
+ **VTP-OVD**: Yanxin Long, Jianhua Han, Runhui Huang, Xu Hang, Yi Zhu, Chunjing Xu, Xiaodan Liang. **P<sup>3</sup>OVD: Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection.** TNNLS 2024. [[paper]](https://arxiv.org/abs/2211.00849)
+ **UOVN**: Hengcan Shi, Munawar Hayat, Jianfei Cai. **Unified Open-Vocabulary Dense Visual Prediction.** TMM 2024. [[paper]](https://arxiv.org/abs/2307.08238)
+ Chaoyang Zhu, Long Chen.
**A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future.** TPAMI 2024. [[paper]](https://arxiv.org/abs/2307.09220)

## 2023
+ **OpenSD**: Shuai Li, Minghan Li, Pengfei Wang, Lei Zhang. **OpenSD: Unified Open-Vocabulary Segmentation and Detection.** arXiv 2024. [[paper]](https://arxiv.org/abs/2312.06703)
+ **PLAC**: Sunghun Kang, Junbum Cha, Jonghwan Mun, Byungseok Roh, Chang D. Yoo. **Learning Pseudo-Labeler beyond Noun Concepts for Open-Vocabulary Object Detection.** arXiv 2023. [[paper]](https://arxiv.org/abs/2312.02103)
+ **MIC**: Zhao Wang, Aoxue Li, Fengwei Zhou, Zhenguo Li, Qi Dou. **Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization.** BMVC 2023. [[paper]](https://proceedings.bmvc2023.org/93/)
+ **OWL-ST**: Matthias Minderer, Alexey Gritsenko, Neil Houlsby. **Scaling Open-Vocabulary Object Detection.** NeurIPS 2023. [[paper]](https://arxiv.org/abs/2306.09683)
+ **CoDet**: Chuofan Ma, Yi Jiang, Xin Wen, Zehuan Yuan, Xiaojuan Qi. **CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection.** NeurIPS 2023. [[paper]](https://arxiv.org/abs/2310.16667) [[code]](https://github.com/CVMI-Lab/CoDet)
+ **SGDN**: Hengcan Shi, Munawar Hayat, Jianfei Cai. **Open-Vocabulary Object Detection via Scene Graph Discovery.** ACM MM 2023. [[paper]](https://arxiv.org/abs/2307.03339)
+ **DE-ViT**: Xinyu Zhang, Yuting Wang, Abdeslam Boularias. **Detect Every Thing with Few Examples.** GCPR 2023. [[paper]](https://arxiv.org/abs/2309.12969) [[code]](https://github.com/mlzxy/devit)
+ **DITO**: Dahun Kim, Anelia Angelova, Weicheng Kuo. **Detection-Oriented Image-Text Pretraining for Open-Vocabulary Detection.** arXiv 2023.
[[paper]](https://paperswithcode.com/paper/detection-oriented-image-text-pretraining-for) [[code]](https://github.com/google-research/google-research/tree/master/fvlm/dito)
+ **OpenSeeD**: Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang. **A Simple Framework for Open-Vocabulary Segmentation and Detection.** ICCV 2023. [[paper]](https://arxiv.org/abs/2303.08131) [[code]](https://github.com/IDEA-Research/OpenSeeD)
+ **CFM-ViT**: Dahun Kim, Anelia Angelova, Weicheng Kuo. **Contrastive Feature Masking Open-Vocabulary Vision Transformer.** ICCV 2023. [[paper]](https://paperswithcode.com/paper/contrastive-feature-masking-open-vocabulary)
+ **EdaDet**: Cheng Shi, Sibei Yang. **EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment.** ICCV 2023. [[paper]](https://arxiv.org/abs/2309.01151)
+ **CGG**: Jianzong Wu, Xiangtai Li, Henghui Ding, Xia Li, Guangliang Cheng, Yunhai Tong, Chen Change Loy. **Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation.** ICCV 2023. [[paper]](https://openaccess.thecvf.com/content/ICCV2023/papers/Wu_Betrayed_by_Captions_Joint_Caption_Grounding_and_Generation_for_Open_ICCV_2023_paper.pdf) [[code]](https://github.com/jianzongwu/betrayed-by-captions)
+ Jincheng Li, Chunyu Xie, Xiaoyu Wu, Bin Wang, Dawei Leng. **What Makes Good Open-Vocabulary Detector: A Disassembling Perspective.** KDD workshop 2023. [[paper]](https://arxiv.org/abs/2309.00227)
+ **SAS-Det**: Shiyu Zhao, Samuel Schulter, Long Zhao, Zhixing Zhang, Vijay Kumar B. G, Yumin Suh, Manmohan Chandraker, Dimitris N. Metaxas. **Improving Pseudo Labels for Open-Vocabulary Object Detection.** arXiv 2023. [[paper]](https://arxiv.org/abs/2308.06412)
+ Prannay Kaul, Weidi Xie, Andrew Zisserman. **Multi-Modal Classifiers for Open-Vocabulary Object Detection.** ICML 2023.
[[paper]](https://openreview.net/pdf?id=Nuymym2DZF) [[code]](https://github.com/prannaykaul/mm-ovod)
+ Relja Arandjelović, Alex Andonian, Arthur Mensch, Olivier J. Hénaff, Jean-Baptiste Alayrac, Andrew Zisserman. **Three Ways to Improve Feature Alignment for Open Vocabulary Detection.** arXiv 2023. [[paper]](https://arxiv.org/abs/2303.13518)
+ **Prompt-OVD**: Hwanjun Song, Jihwan Bang. **Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection.** arXiv 2023. [[paper]](https://arxiv.org/abs/2303.14386)
+ **PCL**: Han-Cheol Cho, Won Young Jhoo, Wooyoung Kang, Byungseok Roh. **Open-Vocabulary Object Detection using Pseudo Caption Labels.** arXiv 2023. [[paper]](https://arxiv.org/abs/2303.13040)
+ **CORA**: Xiaoshi Wu, Feng Zhu, Rui Zhao, Hongsheng Li. **CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching.** CVPR 2023. [[paper]](https://openaccess.thecvf.com/content/CVPR2023/papers/Wu_CORA_Adapting_CLIP_for_Open-Vocabulary_Detection_With_Region_Prompting_and_CVPR_2023_paper.pdf) [[code]](https://github.com/tgxs002/CORA)
+ **OADP**: Luting Wang, Yi Liu, Penghui Du, Zihan Ding, Yue Liao, Qiaosong Qi, Biaolong Chen, Si Liu. **Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection.** CVPR 2023. [[paper]](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_Object-Aware_Distillation_Pyramid_for_Open-Vocabulary_Object_Detection_CVPR_2023_paper.pdf) [[code]](https://github.com/LutingWang/OADP)
+ **BARON**: Size Wu, Wenwei Zhang, Sheng Jin, Wentao Liu, Chen Change Loy. **Aligning Bag of Regions for Open-Vocabulary Object Detection.** CVPR 2023. [[paper]](https://openaccess.thecvf.com/content/CVPR2023/papers/Wu_Aligning_Bag_of_Regions_for_Open-Vocabulary_Object_Detection_CVPR_2023_paper.pdf) [[code]](https://github.com/wusize/ovdet)
+ **RO-ViT**: Dahun Kim, Anelia Angelova, Weicheng Kuo. **Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers.** CVPR 2023.
[[paper]](https://openaccess.thecvf.com/content/CVPR2023/papers/Kim_Region-Aware_Pretraining_for_Open-Vocabulary_Object_Detection_With_Vision_Transformers_CVPR_2023_paper.pdf) [[code]](https://github.com/google-research/google-research/tree/master/fvlm/rovit)
+ **DetCLIPv2**: Lewei Yao, Jianhua Han, Xiaodan Liang, Dan Xu, Wei Zhang, Zhenguo Li, Hang Xu. **DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment.** CVPR 2023. [[paper]](https://openaccess.thecvf.com/content/CVPR2023/papers/Yao_DetCLIPv2_Scalable_Open-Vocabulary_Object_Detection_Pre-Training_via_Word-Region_Alignment_CVPR_2023_paper.pdf)
+ **CondHead**: Tao Wang. **Learning to Detect and Segment for Open Vocabulary Object Detection.** CVPR 2023. [[paper]](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_Learning_To_Detect_and_Segment_for_Open_Vocabulary_Object_Detection_CVPR_2023_paper.pdf)
+ **F-VLM**: Weicheng Kuo, Yin Cui, Xiuye Gu, AJ Piergiovanni, Anelia Angelova. **F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models.** ICLR 2023. [[paper]](https://openreview.net/forum?id=MIMwy4kh9lf) [[code]](https://sites.google.com/view/f-vlm/home)
+ **VLDet**: Chuang Lin, Peize Sun, Yi Jiang, Ping Luo, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan, Jianfei Cai. **Learning Object-Language Alignments for Open-Vocabulary Object Detection.** ICLR 2023. [[paper]](https://openreview.net/pdf?id=mjHlitXvReu) [[code]](https://github.com/clin1223/VLDet)

## 2022
+ **MEDet**: Peixian Chen, Kekai Sheng, Mengdan Zhang, Yunhang Shen, Ke Li, Chunhua Shen. **Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization.** arXiv 2022. [[paper]](https://arxiv.org/abs/2206.11134) [[code]](https://github.com/PeixianChen/MEDet)
+ **LocOV**: Maria A. Bravo, Sudhanshu Mittal, Thomas Brox. **Localized Vision-Language Matching for Open-vocabulary Object Detection.** DAGM German Conference on Pattern Recognition (GCPR) 2022.
[[paper]](https://arxiv.org/abs/2205.06160) [[code]](https://github.com/lmb-freiburg/locov)
+ **Object-Centric-OVD**: Hanoona Rasheed, Muhammad Maaz, Muhammad Uzair Khattak, Salman Khan, Fahad Shahbaz Khan. **Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection.** NeurIPS 2022. [[paper]](https://openreview.net/forum?id=aKXBrj0DHm) [[code]](https://github.com/hanoonaR/object-centric-ovd)
+ **VL-PLM**: Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B.G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris Metaxas. **Exploiting Unlabeled Data with Vision and Language Models for Object Detection.** ECCV 2022. [[paper]](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136690156.pdf) [[code]](https://github.com/xiaofeng94/VL-PLM)
+ **PromptDet**: Chengjian Feng, Yujie Zhong, Zequn Jie, Xiangxiang Chu, Haibing Ren, Xiaolin Wei, Weidi Xie, Lin Ma. **PromptDet: Towards Open-vocabulary Detection using Uncurated Images.** ECCV 2022. [[paper]](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136690691.pdf) [[code]](https://github.com/fcjian/PromptDet)
+ **OpenSeg**: Golnaz Ghiasi, Xiuye Gu, Yin Cui, Tsung-Yi Lin. **Scaling Open-Vocabulary Image Segmentation with Image-Level Labels.** ECCV 2022. [[paper]](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136960532.pdf) [[code]](https://github.com/tensorflow/tpu/tree/641c1ac6e26ed788327b973582cbfa297d7d31e7/models/official/detection/projects/openseg)
+ **OV-DETR**: Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy. **Open-Vocabulary DETR with Conditional Matching.** ECCV 2022. [[paper]](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136690107.pdf) [[code]](https://github.com/yuhangzang/OV-DETR)
+ **PB-OVD**: Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong. **Open Vocabulary Object Detection with Pseudo Bounding-Box Labels.** ECCV 2022.
[[paper]](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136700263.pdf) [[code]](https://github.com/salesforce/PB-OVD)
+ **OWL-ViT**: Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby. **Simple Open-Vocabulary Object Detection with Vision Transformers.** ECCV 2022. [[paper]](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136700714.pdf) [[code]](https://github.com/google-research/scenic/tree/main/scenic/projects/owl_vit)
+ **RegionCLIP**: Yiwu Zhong, Jianwei Yang, Pengchuan Zhang, Chunyuan Li, Noel Codella, Liunian Harold Li, Luowei Zhou, Xiyang Dai, Lu Yuan, Yin Li, Jianfeng Gao. **RegionCLIP: Region-Based Language-Image Pretraining.** CVPR 2022. [[paper]](https://openaccess.thecvf.com/content/CVPR2022/html/Zhong_RegionCLIP_Region-Based_Language-Image_Pretraining_CVPR_2022_paper.html) [[code]](https://github.com/microsoft/RegionCLIP)
+ **XPM**: Dat Huynh, Jason Kuen, Zhe Lin, Jiuxiang Gu, Ehsan Elhamifar. **Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling.** CVPR 2022. [[paper]](https://openaccess.thecvf.com/content/CVPR2022/html/Huynh_Open-Vocabulary_Instance_Segmentation_via_Robust_Cross-Modal_Pseudo-Labeling_CVPR_2022_paper.html) [[code]](https://github.com/hbdat/cvpr22_cross_modal_pseudo_labeling)
+ **HierKD**: Zongyang Ma, Guan Luo, Jin Gao, Liang Li, Yuxin Chen, Shaoru Wang, Congxuan Zhang, Weiming Hu. **Open-Vocabulary One-Stage Detection With Hierarchical Visual-Language Knowledge Distillation.** CVPR 2022. [[paper]](https://openaccess.thecvf.com/content/CVPR2022/papers/Ma_Open-Vocabulary_One-Stage_Detection_With_Hierarchical_Visual-Language_Knowledge_Distillation_CVPR_2022_paper.pdf) [[code]](https://github.com/mengqiDyangge/HierKD)
+ **DetPro**: Yu Du, Fangyun Wei, Zihe Zhang, Miaojing Shi, Yue Gao, Guoqi Li.
**Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model.** CVPR 2022. [[paper]](https://openaccess.thecvf.com/content/CVPR2022/papers/Du_Learning_To_Prompt_for_Open-Vocabulary_Object_Detection_With_Vision-Language_Model_CVPR_2022_paper.pdf) [[code]](https://github.com/dyabel/detpro)
+ **ViLD**: Xiuye Gu, Tsung-Yi Lin, Weicheng Kuo, Yin Cui. **Open-vocabulary Object Detection via Vision and Language Knowledge Distillation.** ICLR 2022. [[paper]](https://openreview.net/forum?id=lL3lnMbR4WU) [[code]](https://github.com/tensorflow/tpu/tree/master/models/official/detection/projects/vild)

## 2021
+ **OVR-CNN**: Alireza Zareian, Kevin Dela Rosa, Derek Hao Hu, Shih-Fu Chang. **Open-Vocabulary Object Detection Using Captions.** CVPR 2021. [[paper]](https://openaccess.thecvf.com/content/CVPR2021/papers/Zareian_Open-Vocabulary_Object_Detection_Using_Captions_CVPR_2021_paper.pdf) [[code]](https://github.com/alirezazareian/ovr-cnn)