Repository: sbrugman/deep-learning-papers Branch: master Commit: 358e237239de Files: 3 Total size: 49.4 KB Directory structure: gitextract_a3mjjkf_/ ├── .gitignore ├── README.md └── task-papers.md ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ .idea/* ================================================ FILE: README.md ================================================ # Deep Learning Papers by task Papers about deep learning ordered by task, date. For each paper there is a permanent link, which is either to Arxiv.org or to a copy of the original paper in this repository. # Table of Contents 1. [Code](#code) 1.1. [Code Generation](#code-generation) 1.2. [Malware Detection and Security](#malware-detection-and-security) 2. [Text](#text) 2.1. [Summarization](#summarization) 2.2. [Taskbots](#taskbots) 2.3. [Classification](#classification) 2.4. [Question Answering](#question-answering) 2.5. [Sentiment Analysis](#sentiment-analysis) 2.6. [Translation](#translation) 2.7. [Chatbots](#chatbots) 2.8. [Reasoning](#reasoning) 2.9. [Language Representation](#language-representation) 3. [Visual](#visual) 3.1. [Gaming](#gaming) 3.2. [Style Transfer](#style-transfer) 3.3. [Object Tracking](#object-tracking) 3.4. [Visual Question Answering](#visual-question-answering) 3.5. [Image Segmentation](#image-segmentation) 3.6. [Text (in the Wild) Recognition](#text-in-the-wild-recognition) 3.7. [Brain Computer Interfacing](#brain-computer-interfacing) 3.8. [Self-Driving Cars](#self-driving-cars) 3.9. [Object Recognition](#object-recognition) 3.10. [Logo Recognition](#logo-recognition) 3.11. [Super Resolution](#super-resolution) 3.12. [Pose Estimation](#pose-estimation) 3.13. [Image Captioning](#image-captioning) 3.14. [Image Compression](#image-compression) 3.15. [Image Synthesis](#image-synthesis) 3.16. [Face Recognition](#face-recognition) 3.17. [Image Composition](#image-composition) 3.18. [Scene Graph Parsing](#scene-graph-parsing) 3.19. [Video Deblurring](#video-deblurring) 3.20. [Depth Perception](#depth-perception) 3.21. [3D Reconstruction](#3d-reconstruction) 3.22. [Vision Representation](#vision-representation) 4. [Audio](#audio) 4.1. [Audio Synthesis](#audio-synthesis) 5. [Other](#other) 5.1. [Unclassified](#unclassified) 5.2. [Regularization](#regularization) 5.3. [Neural Network Compression](#neural-network-compression) 5.4. [Optimizers](#optimizers) ## Code ### Code Generation |Title|Date|Paper|Code| |---|---|---|---| | DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07734) | | | A Syntactic Neural Model for General-Purpose Code Generation | _6 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.01696) | | | RobustFill: Neural Program Learning under Noisy I/O | _21 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.07469) | | | DeepFix: Fixing Common C Language Errors by Deep Learning | _12 feb 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deepfix-fixing-common-c-language-errors-by-deep-learning.pdf) | | | DeepCoder: Learning to Write Programs | _7 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01989) | | | Neuro-Symbolic Program Synthesis | _6 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01855) | | | Deep API Learning | _27 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.08535) | | ### Malware Detection and Security |Title|Date|Paper|Code| |---|---|---|---| | PassGAN: A Deep Learning Approach for Password Guessing | _1 sep 2017_ | [arxiv](https://arxiv.org/pdf/1709.00440) | | | Deep Android Malware Detection | _22 mar 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-android-malware-detection.pdf) | [github](https://github.com/niallmcl/Deep-Android-Malware-Detection) | | Droid-Sec: Deep Learning in Android Malware Detection | _17 aug 2014_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/droid-sec-deep-learning-in-android-malware-detection.pdf) | [github](https://github.com/pjlantz/droidbox) | ## Text ### Summarization |Title|Date|Paper|Code| |---|---|---|---| | A Deep Reinforced Model for Abstractive Summarization | _11 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.04304) | | | Get To The Point: Summarization with Pointer-Generator Networks | _14 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.04368) | | | SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents | _14 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.04230) | | ### Taskbots |Title|Date|Paper|Code| |---|---|---|---| | Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning | _10 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03084) | [github](https://github.com/MiuLab/TC-Bot) | | End-to-End Task-Completion Neural Dialogue Systems | _3 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.01008) | [github](https://github.com/MiuLab/TC-Bot) | ### Classification |Title|Date|Paper|Code| |---|---|---|---| | A Large Self-Annotated Corpus for Sarcasm | _19 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05579) | | | ConceptNet at SemEval-2017 Task 2: Extending Word Embeddings with Multilingual Relational Knowledge | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03560) | | | Bilateral Multi-Perspective Matching for Natural Language Sentences | _13 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.03814) | | | FastText.zip: Compressing text classification models | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03651) | | | ConceptNet 5.5: An Open Multilingual Graph of General Knowledge | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03975) | | | A Simple but Tough-to-Beat Baseline for Sentence Embeddings | _4 nov 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/a-simple-but-tough-to-beat-baseline-for-sentence-embeddings.pdf) | [github](https://github.com/YingyuLiang/SIF) | | Enriching Word Vectors with Subword Information | _15 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.04606) | | | From Word Embeddings To Document Distances | _6 jul 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/from-word-embeddings-to-document-distances.pdf) | [github](https://github.com/mkusner/wmd) | | Bag of Tricks for Efficient Text Classification | _6 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.01759) | | | Character-level Convolutional Networks for Text Classification | _4 sep 2015_ | [arxiv](https://arxiv.org/pdf/1509.01626) | | | GloVe: Global Vectors for Word Representation | _25 may 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/glove-global-vectors-for-word-representation.pdf) | [github](https://github.com/stanfordnlp/GloVe) | | Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks | _28 feb 2015_ | [arxiv](https://arxiv.org/pdf/1503.00075) | | | Distributed Representations of Sentences and Documents | _16 may 2014_ | [arxiv](https://arxiv.org/pdf/1405.4053) | | | Efficient Estimation of Word Representations in Vector Space | _16 jan 2013_ | [arxiv](https://arxiv.org/pdf/1301.3781) | | | SimHash: Hash-based Similarity Detection | _13 dec 2007_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/simhash-hash-based-similarity-detection.pdf) | | ### Question Answering |Title|Date|Paper|Code| |---|---|---|---| | IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models | _30 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.10513) | [github](https://github.com/geek-ai/irgan) | ### Sentiment Analysis |Title|Date|Paper|Code| |---|---|---|---| | Rationalizing Neural Predictions | _13 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.04155) | | | Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank | _18 okt 2013_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/recursive-deep-models-for-semantic-compositionality-over-a-sentiment-treebank.pdf) | | ### Translation |Title|Date|Paper|Code| |---|---|---|---| | Attention Is All You Need | _12 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.03762) | | | Convolutional Sequence to Sequence Learning | _8 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.03122) | [github](https://github.com/facebookresearch/fairseq) | | Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation | _14 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.04558) | | | A Convolutional Encoder Model for Neural Machine Translation | _7 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.02344) | | | Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | _26 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.08144) | | | Neural Machine Translation by Jointly Learning to Align and Translate | _1 sep 2014_ | [arxiv](https://arxiv.org/pdf/1409.0473) | | ### Chatbots |Title|Date|Paper|Code| |---|---|---|---| | A Deep Reinforcement Learning Chatbot | _7 sep 2017_ | [arxiv](https://arxiv.org/pdf/1709.02349) | | | A Neural Conversational Model | _19 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.05869) | [github](https://github.com/inikdom/neural-chatbot) | ### Reasoning |Title|Date|Paper|Code| |---|---|---|---| | NeuroSAT: Learning a SAT Solver from Single-Bit Supervision | _5 jan 2019_ | [arxiv](https://arxiv.org/pdf/1802.03685.pdf) | [github](https://github.com/dselsam/neurosat) | | Tracking the World State with Recurrent Entity Networks | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03969) | | ### Language Representation |Title|Date|Paper|Code| |---|---|---|---| | Efficient Estimation of Word Representations in Vector Space | _7 sep 2013_ | [arxiv](https://arxiv.org/pdf/1301.3781.pdf) | | | Distributed Representations of Words and Phrases and their Compositionality | _16 okt 2013_ | [arxiv](https://arxiv.org/pdf/1310.4546.pdf) | | ELMO: Deep contextualized word representations | _22 Mar 2018_ | [arxiv](https://arxiv.org/pdf/1802.05365.pdf) | | | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | _24 may 2019_ | [arxiv](https://arxiv.org/pdf/1810.04805.pdf) | [github](https://github.com/google-research/bert) | | XLNet: Generalized Autoregressive Pretraining for Language Understanding | _19 jun 2019_ | [arxiv](https://arxiv.org/pdf/1906.08237.pdf) | [github](https://github.com/zihangdai/xlnet) | | RoBERTa: A Robustly Optimized BERT Pretraining Approach | _26 jul 2019_ | [arxiv](https://arxiv.org/pdf/1907.11692.pdf) | [github](https://github.com/pytorch/fairseq/tree/master/examples/roberta) | ## Visual ### Gaming |Title|Date|Paper|Code| |---|---|---|---| | Phase-Functioned Neural Networks for Character Control | _1 may 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/phase-functioned-neural-networks-for-character-control.pdf) | | | Equivalence Between Policy Gradients and Soft Q-Learning | _21 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.06440) | | | Beating Atari with Natural Language Guided Reinforcement Learning | _18 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05539) | | | Learning from Demonstrations for Real World Reinforcement Learning | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03732) | | | FeUdal Networks for Hierarchical Reinforcement Learning | _3 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.01161) | | | Overcoming catastrophic forgetting in neural networks | _2 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00796) | | | Playing Doom with SLAM-Augmented Deep Reinforcement Learning | _1 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00380) | | | Playing FPS Games with Deep Reinforcement Learning | _18 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05521) | | | DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess | _16 aug 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deepchess-end-to-end-deep-neural-network-for-automatic-learning-in-chess.pdf) | | | Generative Adversarial Imitation Learning | _10 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.03476) | | | Dueling Network Architectures for Deep Reinforcement Learning | _20 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.06581) | | | Prioritized Experience Replay | _18 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.05952) | | | Human-level control through deep reinforcement learning | _26 feb 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/human-level-control-through-deep-reinforcement-learning.pdf) | | | Playing Atari with Deep Reinforcement Learning | _19 dec 2013_ | [arxiv](https://arxiv.org/pdf/1312.5602) | | ### Style Transfer |Title|Date|Paper|Code| |---|---|---|---| | The Contextual Loss for Image Transformation with Non-Aligned Data | _18 jul 2018_ | [arxiv](https://arxiv.org/pdf/1803.02077.pdf) | [github](https://github.com/roimehrez/contextualLoss) | | Deep Photo Style Transfer | _22 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.07511) | | | Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization | _20 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06868) | [github](https://github.com/xunhuang1995/AdaIN-style) | | A Learned Representation For Artistic Style | _24 okt 2016_ | [arxiv](https://arxiv.org/pdf/1610.07629) | | | Instance Normalization: The Missing Ingredient for Fast Stylization | _27 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.08022) | | | Perceptual Losses for Real-Time Style Transfer and Super-Resolution | _27 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.08155) | [github](http://github.com/jcjohnson/fast-neural-style) | | A Neural Algorithm of Artistic Style | _26 aug 2015_ | [arxiv](https://arxiv.org/pdf/1508.06576) | [github](https://github.com/lengstrom/fast-style-transfer/) | ### Object Tracking |Title|Date|Paper|Code| |---|---|---|---| | End-to-end representation learning for Correlation Filter based tracking | _20 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.06036) | [github](https://github.com/bertinetto/cfnet) | ### Visual Question Answering |Title|Date|Paper|Code| |---|---|---|---| | VQA: Visual Question Answering | _3 may 2015_ | [arxiv](https://arxiv.org/pdf/1505.00468) | | ### Image Segmentation |Title|Date|Paper|Code| |---|---|---|---| | PointRend: Image Segmentation as Rendering | _17 dec 2019_ | [arxiv](https://arxiv.org/pdf/1912.08193.pdf) | | | Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation | _22 aug 2018_ | [paper](https://arxiv.org/pdf/1802.02611.pdf) [cvpr](http://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Auto-DeepLab_Hierarchical_Neural_Architecture_Search_for_Semantic_Image_Segmentation_CVPR_2019_paper.pdf) | [github](https://github.com/tensorflow/models/tree/master/research/deeplab) | | Dilated Residual Networks | _22 jul 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/dilated-residual-networks.pdf) | | | SfM-Net: Learning of Structure and Motion from Video | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07804) | | | Semi and Weakly Supervised Semantic Segmentation Using Generative Adversarial Network | _28 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.09695) | | | Mask R-CNN | _20 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06870) | | | Learning Features by Watching Objects Move | _19 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.06370) | | | RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation | _20 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.06612) | [github](https://github.com/guosheng/refinenet) | | UberNet: Training a `Universal' Convolutional Neural Network for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory | _7 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.02132) | | | DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs | _2 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.00915) | | | Fully Convolutional Networks for Semantic Segmentation | _20 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.06211) | [github](https://github.com/shelhamer/fcn.berkeleyvision.org) | | Instance-aware Semantic Segmentation via Multi-task Network Cascades | _14 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.04412) | | | Multi-Scale Context Aggregation by Dilated Convolutions | _23 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.07122) | | | SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation | _2 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.00561) | | | U-Net: Convolutional Networks for Biomedical Image Segmentation | _18 may 2015_ | [arxiv](https://arxiv.org/pdf/1505.04597) | | | Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs | _22 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.7062) | | | Learning Rich Features from RGB-D Images for Object Detection and Segmentation | _22 jul 2014_ | [arxiv](https://arxiv.org/pdf/1407.5736) | | ### Text (in the Wild) Recognition |Title|Date|Paper|Code| |---|---|---|---| | OCR Error Correction Using Character Correction and Feature-Based Word Classification | _21 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.06225) | | | Recursive Recurrent Nets with Attention Modeling for OCR in the Wild | _9 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.03101) | | | COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images | _26 jan 2016_ | [arxiv](https://arxiv.org/pdf/1601.07140) | | | Efficient Scene Text Localization and Recognition with Local Character Refinement | _14 apr 2015_ | [arxiv](https://arxiv.org/pdf/1504.03522) | | | Reading Text in the Wild with Convolutional Neural Networks | _4 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.1842) | | | Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition | _9 jun 2014_ | [arxiv](https://arxiv.org/pdf/1406.2227) | | ### Brain Computer Interfacing |Title|Date|Paper|Code| |---|---|---|---| | Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG | _15 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.05051) | | | Encoding Voxels with Deep Learning | _2 dec 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/encoding-voxels-with-deep-learning.pdf) | | | Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream | _8 jul 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-neural-networks-reveal-a-gradient-in-the-complexity-of-neural-representations-across-the-ventral-stream.pdf) | | ### Self-Driving Cars |Title|Date|Paper|Code| |---|---|---|---| | Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art | _18 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05519) | | | End to End Learning for Self-Driving Cars | _25 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.07316) | | ### Object Recognition |Title|Date|Paper|Code| |---|---|---|---| | Cascade R-CNN: High Quality Object Detection and Instance Segmentation | _24 Jun 2019_ | [arxiv](https://arxiv.org/pdf/1906.09756.pdf) | [github](https://github.com/zhaoweicai/cascade-rcnn) | | YOLOv3: An Incremental Improvement | _8 Apr 2018_ | [arxiv](https://arxiv.org/pdf/1804.02767.pdf) | [github]( https://github.com/pjreddie/darknet), [github reimplementation](https://github.com/ultralytics/yolov3) | | Focal Loss for Dense Object Detection | _7 aug 2017_ | [arxiv](https://arxiv.org/pdf/1708.02002) | | | Introspective Classifier Learning: Empower Generatively | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07816) | | | Learning Chained Deep Features and Classifiers for Cascade in Object Detection | _23 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.07054) | | | DSSD : Deconvolutional Single Shot Detector | _23 jan 2017_ | [arxiv](https://arxiv.org/pdf/1701.06659) | | | YOLO9000: Better, Faster, Stronger | _25 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.08242) | [github](https://github.com/pjreddie/darknet) | | Feature Pyramid Networks for Object Detection | _9 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03144) | | | Speed/accuracy trade-offs for modern convolutional object detectors | _30 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.10012) | | | Aggregated Residual Transformations for Deep Neural Networks | _16 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.05431) | | | Aggregated Residual Transformations for Deep Neural Networks | _16 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.05431) | | | Hierarchical Object Detection with Deep Reinforcement Learning | _11 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.03718) | | | Xception: Deep Learning with Depthwise Separable Convolutions | _7 okt 2016_ | [arxiv](https://arxiv.org/pdf/1610.02357) | | | Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition | _1 okt 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/learning-to-make-better-mistakes-semantics-aware-visual-food-recognition.pdf) | | | Densely Connected Convolutional Networks | _25 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.06993) | | | Residual Networks of Residual Networks: Multilevel Residual Networks | _9 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.02908) | | | Context Matters: Refining Object Detection in Video with Recurrent Neural Networks | _15 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.04648) | | | R-FCN: Object Detection via Region-based Fully Convolutional Networks | _20 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.06409) | | | Training Region-based Object Detectors with Online Hard Example Mining | _12 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.03540) | | | T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos | _9 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.02532) | | | Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning | _23 feb 2016_ | [arxiv](https://arxiv.org/pdf/1602.07261) | | | Deep Residual Learning for Image Recognition | _10 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.03385) | | | SSD: Single Shot MultiBox Detector | _8 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.02325) | | | Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) | _23 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.07289) | | | ParseNet: Looking Wider to See Better | _15 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.04579) | | | You Only Look Once: Unified, Real-Time Object Detection | _8 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02640) | | | Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | _4 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.01497) | | | Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification | _6 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.01852) | | | Deep Image: Scaling up Image Recognition | _13 jan 2015_ | [arxiv](https://arxiv.org/pdf/1501.02876) | | | Rich feature hierarchies for accurate object detection and semantic segmentation | _11 nov 2013_ | [arxiv](https://arxiv.org/pdf/1311.2524) | | | Selective Search for Object Recognition | _11 mar 2013_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/selective-search-for-object-recognition.pdf) | | | ImageNet Classification with Deep Convolutional Neural Networks | _3 dec 2012_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/imagenet-classification-with-deep-convolutional-neural-networks.pdf) | | ### Logo Recognition |Title|Date|Paper|Code| |---|---|---|---| | Deep Learning Logo Detection with Data Expansion by Synthesising Context | _29 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.09322) | | | Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks | _20 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.06083) | | | LOGO-Net: Large-scale Deep Logo Detection and Brand Recognition with Deep Region-based Convolutional Networks | _8 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.02462) | | | DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer | _7 okt 2015_ | [arxiv](https://arxiv.org/pdf/1510.02131) | | ### Super Resolution |Title|Date|Paper|Code| |---|---|---|---| | Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network | _16 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05158) | | | Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network | _15 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.04802) | | | RAISR: Rapid and Accurate Image Super Resolution | _3 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.01299) | | | Perceptual Losses for Real-Time Style Transfer and Super-Resolution | _27 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.08155) | [github](http://github.com/jcjohnson/fast-neural-style) | | Image Super-Resolution Using Deep Convolutional Networks | _31 dec 2014_ | [arxiv](https://arxiv.org/pdf/1501.00092) | | ### Pose Estimation |Title|Date|Paper|Code| |---|---|---|---| | Forecasting Human Dynamics from Static Images | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03432) | | | Fast Single Shot Detection and Pose Estimation | _19 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05590) | | | Convolutional Pose Machines | _30 jan 2016_ | [arxiv](https://arxiv.org/pdf/1602.00134) | | | Flowing ConvNets for Human Pose Estimation in Videos | _9 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02897) | | ### Image Captioning |Title|Date|Paper|Code| |---|---|---|---| | Actor-Critic Sequence Training for Image Captioning | _29 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.09601) | | | Detecting and Recognizing Human-Object Interactions | _24 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07333) | | | Deep Reinforcement Learning-based Image Captioning with Embedding Reward | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03899) | | | Towards Diverse and Natural Image Descriptions via a Conditional GAN | _17 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06029) | | | Temporal Tessellation: A Unified Approach for Video Analysis | _21 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.06950) | [github](https://github.com/dot27/temporal-tessellation) | | Self-critical Sequence Training for Image Captioning | _2 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00563) | | | Generation and Comprehension of Unambiguous Object Descriptions | _7 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.02283) | | | Show, Attend and Tell: Neural Image Caption Generation with Visual Attention | _10 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.03044) | | | Long-term Recurrent Convolutional Networks for Visual Recognition and Description | _17 nov 2014_ | [arxiv](https://arxiv.org/pdf/1411.4389) | | ### Image Compression |Title|Date|Paper|Code| |---|---|---|---| | Full Resolution Image Compression with Recurrent Neural Networks | _18 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.05148) | | ### Image Synthesis |Title|Date|Paper|Code| |---|---|---|---| | Scene Text Synthesis for Efficient and Effective Deep Network Training | _26 jan 2019_ | [arxiv](https://arxiv.org/pdf/1901.09193.pdf) | | | A Neural Representation of Sketch Drawings | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03477) | | | BEGAN: Boundary Equilibrium Generative Adversarial Networks | _31 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10717) | [github](https://github.com/carpedm20/BEGAN-tensorflow) | | Improved Training of Wasserstein GANs | _31 mar 2017_ | [arxiv](https://arxiv.org/pdf/1704.00028) | | | Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks | _30 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10593) | [github](https://github.com/junyanz/CycleGAN) | | Wasserstein GAN | _26 jan 2017_ | [arxiv](https://arxiv.org/pdf/1701.07875) | | | RenderGAN: Generating Realistic Labeled Data | _4 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01331) | | | Conditional Image Generation with PixelCNN Decoders | _16 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.05328) | | | Pixel Recurrent Neural Networks | _25 jan 2016_ | [arxiv](https://arxiv.org/pdf/1601.06759) | | | Generative Adversarial Networks | _10 jun 2014_ | [arxiv](https://arxiv.org/pdf/1406.2661) | | ### Face Recognition |Title|Date|Paper|Code| |---|---|---|---| | Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition | _24 okt 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/accessorize-to-a-crime-real-and-stealthy-attacks-on-state-of-the-art-face-recognition.pdf) | | | OpenFace: A general-purpose face recognition library with mobile applications | _1 jun 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/openface-a-general-purpose-face-recognition-library-with-mobile-applications.pdf) | | | Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns | _9 nov 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/emotion-recognition-in-the-wild-via-convolutional-neural-networks-and-mapped-binary-patterns.pdf) | | | Deep Face Recognition | _7 sep 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-face-recognition.pdf) | | | Compact Convolutional Neural Network Cascade for Face Detection | _6 aug 2015_ | [arxiv](https://arxiv.org/pdf/1508.01292) | | | Learning Robust Deep Face Representation | _17 jul 2015_ | [arxiv](https://arxiv.org/pdf/1507.04844) | | | Facenet: A unified embedding for face recognition and clustering | _12 jun 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/facenet-a-unified-embedding-for-face-recognition-and-clustering.pdf) | | | Multi-view Face Detection Using Deep Convolutional Neural Networks | _10 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.02766) | | ### Image Composition |Title|Date|Paper|Code| |---|---|---|---| | Auto-Retoucher(ART) — A Framework for Background Replacement and Foreground Adjustment | _13 jan 2019_ | [arxiv](https://arxiv.org/pdf/1901.03954.pdf) (brave new task) | [github](https://github.com/woshiyyya/Auto-Retoucher-pytorch) (not able to reproduce results based on code) | | Spatial Fusion GAN for Image Synthesis | _14 Dec 2018_ | [arxiv](https://arxiv.org/pdf/1812.05840.pdf) (needs revision, interesting approach however) | [github](https://github.com/fnzhan/SF-GAN) (currently, no code available) | | Compositional GAN: Learning Conditional Image Composition | _23 Aug 2018_ | [arxiv](https://arxiv.org/pdf/1807.07560.pdf) (with respect to spatial orientation) | [github](https://github.com/azadis/CompositionalGAN) (currently, no code available) | | ST-GAN | _5 mar 2018_ | [arxiv](https://arxiv.org/pdf/1803.01837) (with respect to spatial orientation) | [github](https://github.com/chenhsuanlin/spatial-transformer-GAN) | | Deep Painterly Harmonization | _26 Jun 2018_ | [paper](https://arxiv.org/pdf/1804.03189.pdf) | [github](https://github.com/luanfujun/deep-painterly-harmonization) | | Deep Image Harmonization | _28 feb 2017_ | [paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Tsai_Deep_Image_Harmonization_CVPR_2017_paper.pdf) | [github](https://github.com/wasidennis/DeepHarmonization) (only code for inference) | | Understanding and Improving the Realism of Image Composites | _1 Jul 2012_ | [paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.682.3987&rep=rep1&type=pdf) | | ### Scene Graph Parsing |Title|Date|Paper|Code| |---|---|---|---| | Neural Motifs: Scene Graph Parsing with Global Context | _29 Mar 2018_ | [arxiv](https://arxiv.org/pdf/1711.06640.pdf) | [github](https://github.com/rowanz/neural-motifs) | ### Video Deblurring |Title|Date|Paper|Code| |---|---|---|---| | Spatio-Temporal Filter Adaptive Network for Video Deblurring | _28 Apr 2019_ | [arxiv](https://arxiv.org/pdf/1904.12257.pdf) | [github](https://shangchenzhou.com/projects/stfan/) (to appear) | ### Depth Perception |Title|Date|Paper|Code| |---|---|---|---| | Learning Depth with Convolutional Spatial Propagation Network | _13 Okt 2018_ | [arxiv](https://arxiv.org/pdf/1810.02695.pdf) | [github](https://github.com/XinJCheng/CSPN) | | Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches | _18 May 2016_ | [arxiv](https://arxiv.org/pdf/1510.05970.pdf) | [github](https://github.com/jzbontar/mc-cnn) | ### 3D Reconstruction |Title|Date|Paper|Code| |---|---|---|---| | Cerberus: A Multi-headed Derenderer | _28 May 2019_ | [arxiv](https://arxiv.org/pdf/1905.11940.pdf) | | ### Vision Representation |Title|Date|Paper|Code| |---|---|---|---| | VisualBERT: A Simple and Performant Baseline for Vision and Language | _9 aug 2019_ | [arxiv](https://arxiv.org/pdf/1908.03557.pdf) | | | Expected to appear: some paper learning an unsupervised vision representation that beats SOTA on a large number of tasks | <Before the end of 2019> | | | ## Audio ### Audio Synthesis |Title|Date|Paper|Code| |---|---|---|---| | Deep Cross-Modal Audio-Visual Generation | _26 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.08292) | | | A Neural Parametric Singing Synthesizer | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03809) | | | Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders | _5 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.01279) | [github](https://github.com/tensorflow/magenta/tree/master/magenta/models/nsynth) | | Tacotron: Towards End-to-End Speech Synthesis | _29 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10135) | [github](https://github.com/Kyubyong/tacotron) | | Deep Voice: Real-time Neural Text-to-Speech | _25 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.07825) | | | WaveNet: A Generative Model for Raw Audio | _12 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.03499) | [github](https://github.com/ibab/tensorflow-wavenet) | ## Other ### Unclassified |Title|Date|Paper|Code| |---|---|---|---| | A simple neural network module for relational reasoning | _5 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.01427) | | | Deep Complex Networks | _27 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.09792) | [github](https://github.com/ChihebTrabelsi/deep_complex_networks) | | Learning to Fly by Crashing | _19 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05588) | | | Who Said What: Modeling Individual Labelers Improves Classification | _26 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.08774) | | | Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data | _18 okt 2016_ | [arxiv](https://arxiv.org/pdf/1610.05755) | | | DeepMath - Deep Sequence Models for Premise Selection | _14 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.04442) | | | Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue | _16 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.04992) | | | Long Short-Term Memory | _15 nov 1997_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/long-short-term-memory.pdf) | | ### Regularization |Title|Date|Paper|Code| |---|---|---|---| | Self-Normalizing Neural Networks | _8 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.02515) | | | Concrete Dropout | _22 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.07832) | [github](https://github.com/yaringal/ConcreteDropout) | | Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning | _6 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02142) | | | Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift | _11 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.03167) | | ### Neural Network Compression |Title|Date|Paper|Code| |---|---|---|---| | Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial "Bottleneck" Structure | _15 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.04337) | | | SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size | _24 feb 2016_ | [arxiv](https://arxiv.org/pdf/1602.07360) | | | Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding | _1 okt 2015_ | [arxiv](https://arxiv.org/pdf/1510.00149) | | ### Optimizers |Title|Date|Paper|Code| |---|---|---|---| | Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour | _8 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.02677) | | | Equilibrated adaptive learning rates for non-convex optimization | _15 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.04390) | | | Adam: A Method for Stochastic Optimization | _22 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.6980) | | | Deep learning with Elastic Averaging SGD | _20 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.6651) | | | ADADELTA: An Adaptive Learning Rate Method | _22 dec 2012_ | [arxiv](https://arxiv.org/pdf/1212.5701) | | | Advances in Optimizing Recurrent Networks | _4 dec 2012_ | [arxiv](https://arxiv.org/pdf/1212.0901) | | | Efficient Backprop | _1 jul 1998_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/efficient-backprop.pdf) | | ## A note on arXiv arXiv provides the world with access to the newest scientific developments. Open Access has a myriad of benefits, in particular, it allows science to be more efficient. Remember to think about the quality of the papers referenced. In particular, the importance of the [peer-review process](https://undsci.berkeley.edu/article/howscienceworks_16) for science. If you find an article on arXiv you should check if it has been peer-reviewed and published elsewhere. The authoritative version of the paper is not the version on arXiv, rather it is the published peer-reviewed version. The two versions may differ significantly. For example, this is the case with one of the papers that I once discussed in the Text and Multimedia Mining class at Radboud: - [peer-reviewed version](http://opus.bath.ac.uk/55288/4/CaliskanEtAl_authors_full.pdf) - [arXiv version](https://arxiv.org/abs/1608.07187) Compare for yourself. For the selection of the papers above, I choose open access over completeness. If you find another (open) version of a paper, you are invited to make a pull request. ================================================ FILE: task-papers.md ================================================ # Deep Learning Papers by task Papers about deep learning ordered by task, date. Current state-of-the-art papers and papers useful for getting started are labelled. ## Object Recognition * YOLO9000: Better, Faster, Stronger, dec 2016, [arxiv](https://arxiv.org/pdf/1612.08242v1)  * Feature Pyramid Networks for Object Detection, dec 2016, [arxiv](https://arxiv.org/pdf/1612.03144.pdf) * Densely Connected Convolutional Networks, dec 2016, [arxiv](https://arxiv.org/pdf/1608.06993.pdf) * PVANet: Lightweight Deep Neural Networks for Real-time Object Detection, dec 2016, [arxiv](https://arxiv.org/pdf/1611.08588.pdf) * Aggregated Residual Transformations for Deep Neural Networks, nov 2016, [arxiv](https://arxiv.org/pdf/1611.05431v1.pdf) * Hierarchical Object Detection with Deep Reinforcement Learning, nov 2016, [arxiv](https://arxiv.org/pdf/1611.03718v1.pdf) * Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition, okt 2016, IBM, [paper](https://github.com/sbrugman/deep-learning-papers/blob/master/papers/learning-to-make-better-mistakes-semantics-aware-visual-food-recognition.pdf) * T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos, aug 2016, [github](https://github.com/myfavouritekk/T-CNN), [arxiv](https://arxiv.org/pdf/1604.02532v3) * Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, aug 2016, Google, [arxiv](https://arxiv.org/pdf/1602.07261.pdf) * Residual Networks of Residual Networks: Multilevel Residual Networks, aug 2016, [arxiv](https://arxiv.org/pdf/1608.02908v1) * Context Matters: Refining Object Detection in Video with Recurrent Neural Networks, jul 2016, [arxiv](https://arxiv.org/pdf/1607.04648.pdf) * R-FCN: Object Detection via Region-based Fully Convolutional Networks, jun 2016, [arxiv](https://arxiv.org/pdf/1605.06409.pdf) * Training Region-based Object Detectors with Online Hard Example Mining, apr 2016, Facebook, [arxiv](https://arxiv.org/pdf/1604.03540v1) * Deep Residual Learning for Image Recognition, dec 2015, [arxiv](https://arxiv.org/pdf/1512.03385v1) * SSD: Single Shot MultiBox Detector, dec 2015, Google, [github](https://github.com/weiliu89/caffe/tree/ssd), [arxiv](https://arxiv.org/pdf/1512.02325v2) * ParseNet: Looking Wider to See Better, jun 2015, [arxiv](https://arxiv.org/pdf/1506.04579v2.pdf) * You Only Look Once: Unified, Real-Time Object Detection, jun 2015, Facebook, [arxiv](https://arxiv.org/pdf/1506.02640v5) * Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, jun 2015, Microsoft/Facebook [arxiv](https://arxiv.org/pdf/1506.01497v3.pdf) * Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, feb 2015, [arxiv](https://arxiv.org/pdf/1502.01852v1) * Deep Image: Scaling up Image Recognition, jul 2015, [arxiv](https://arxiv.org/vc/arxiv/papers/1501/1501.02876v1.pdf) * Rich feature hierarchies for accurate object detection and semantic segmentation, 2014, [paper](http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf) * Selective Search for Object Recognition, 2012, [paper](https://pdfs.semanticscholar.org/6a65/f9abad1022c7df2c75b819f48251aac23ae8.pdf) * ImageNet Classification with Deep Convolutional Neural Networks, 2012, [paper](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)  ## Image Segmentation * Mask R-CNN, mar 2017, [arxiv](https://arxiv.org/pdf/1703.06870) * Learning Features by Watching Objects Move, dec 2016, Facebook, [arxiv](https://arxiv.org/pdf/1612.06370.pdf)  * Fully Convolutional Networks for Semantic Segmentation, may 2016, [arxiv](https://arxiv.org/pdf/1605.06211v1) * Instance-aware Semantic Segmentation via Multi-task Network Cascades, dec 2015, [arxiv](https://arxiv.org/pdf/1512.04412.pdf) * SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, dec 2015, [arxiv](https://arxiv.org/pdf/1511.00561v2.pdf) * Learning Rich Features from RGB-D Images for Object Detection and Segmentation, jul 2014, [arxiv](https://arxiv.org/pdf/1407.5736v1) ## Pose Estimation * Fast Single Shot Detection and Pose Estimation, sep 2016, [arxiv](https://arxiv.org/pdf/1609.05590.pdf) ## Face Recognition * Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition, [paper](https://www.cs.cmu.edu/~sbhagava/papers/face-rec-ccs16.pdf) * OpenFace: A general-purpose face recognition library with mobile applications, June 2016, [paper](http://reports-archive.adm.cs.cmu.edu/anon/anon/usr0/ftp/2016/CMU-CS-16-118.pdf) * Deep Face Recognition, 2015, [paper](http://www.robots.ox.ac.uk:5000/~vgg/publications/2015/Parkhi15/parkhi15.pdf) * Compact Convolutional Neural Network Cascade for Face Detection, aug 2015, [arxiv](https://arxiv.org/pdf/1508.01292v3) * Learning Robust Deep Face Representation, Jul 2015, [arxiv](https://arxiv.org/pdf/1507.04844.pdf) * FaceNet: A Unified Embedding for Face Recognition and Clustering, jun 2015, [paper](http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Schroff_FaceNet_A_Unified_2015_CVPR_paper.pdf) * Multi-view Face Detection Using Deep Convolutional Neural Networks, yahoo, feb 2015, [arxiv](https://arxiv.org/pdf/1502.02766v3.pdf) ## Style Transfer * Deep Photo Style Transfer, mar 2017, [arxiv](https://arxiv.org/pdf/1703.07511) * A learned representation for artistic style, okt 2016, Google, [arxiv](https://arxiv.org/pdf/1610.07629v1.pdf), [demo](https://magenta.tensorflow.org/2016/11/01/multistyle-pastiche-generator/) * Fast Style Transfer in TensorFlow, [github](https://github.com/lengstrom/fast-style-transfer/) * Instance Normalization: The Missing Ingredient for Fast Stylization, sept 2016, [arxiv](https://arxiv.org/abs/1607.08022) * A Neural Algorithm of Artistic Style, sept 2015, [arxiv](https://arxiv.org/abs/1508.06576) * Perceptual Losses for Real-Time Style Transfer and Super-Resolution, [arxiv](https://arxiv.org/pdf/1603.08155v1), [github](http://github.com/jcjohnson/fast-neural-style) ## Logo Recognition * Deep Learning Logo Detection with Data Expansion by Synthesising Context, jan 2016, [arxiv](https://arxiv.org/pdf/1612.09322.pdf) * Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks, apr 2016, [arxiv](https://arxiv.org/pdf/1604.06083.pdf) * Logo Localization and Recognition in Natural Images Using Homographic Class Graphs, 2016, [paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.717.9015&rep=rep1&type=pdf) * LOGO-Net: Large-scale Deep Logo Detection and Brand Recognition with Deep Region-based Convolutional Networks, nov 2015, [arxiv](https://arxiv.org/pdf/1511.02462.pdf) * DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer, okt 2015, Berkely, [arxiv](https://arxiv.org/pdf/1510.02131v1.pdf) * Automatic detection of logos in video and their removal using inpainting, jul 2015, [paper](https://github.com/sbrugman/deep-learning-papers/tree/master/papers/automatic-detection-of-logos-in-video-and-their-removal-using-inpainting.pdf) * On the Benefit of Synthetic Data for Company Logo Detection, 2015, [paper](http://www.multimedia-computing.de/mediawiki/images/c/cf/ACMMM2015.pdf) * Fast and Robust Realtime Storefront Logo Recognition, [paper](https://stacks.stanford.edu/file/druid:bf950qp8995/Liu.pdf) * Scalable Logo Recognition in Real-World Images, 2011, [paper](http://www.multimedia-computing.de/mediawiki/images/3/34/ICMR2011_Scalable_Logo_Recognition_in_Real-World_Images.pdf) * https://arxiv.org/pdf/1609.01414v1.pdf _note: also includes some papers that use SIFT_ ## Text (in the Wild) Recognition * COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images, jun 2016, [arxiv](https://arxiv.org/pdf/1601.07140v2.pdf) * Recursive Recurrent Nets with Attention Modeling for OCR in the Wild, mar 2016, [arxiv](https://arxiv.org/pdf/1603.03101v1)  * OCR Error Correction Using Character Correction and Feature-Based Word Classification, 2016, [arxiv](https://arxiv.org/pdf/1604.06225.pdf) * Efficient Scene Text Localization and Recognition with Local Character Refinement, apr 2015, [arxiv](https://arxiv.org/pdf/1504.03522.pdf) * Reading Text in the Wild with Convolutional Neural Networks, dec 2014, [arxiv](https://arxiv.org/pdf/1412.1842v1.pdf) * Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition, jun 2014, [arxiv](https://arxiv.org/pdf/1406.2227v4.pdf) * Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning, 2011, [paper](http://www.cs.stanford.edu/~acoates/papers/coatesetal_icdar_2011.pdf) ## Image / Video Description * Generation and Comprehension of Unambiguous Object Descriptions, apr 2016, [arxiv](https://arxiv.org/pdf/1511.02283v3.pdf) * Long-term Recurrent Convolutional Networks for Visual Recognition and Description, may 2016, [arxiv](https://arxiv.org/pdf/1411.4389v4) ## Detect key actor * Detecting events and key actors in multi-person videos, mar 2015, [arxiv](https://arxiv.org/pdf/1511.02917v2.pdf) ## Self-driving cars * End to End Learning for Self-Driving Cars, apr 2016, NVIDIA, [arxiv](https://arxiv.org/pdf/1604.07316.pdf) ## ConvNet visualization * Visualizing and Understanding Convolutional Networks, nov 2013, [arxiv](https://arxiv.org/pdf/1311.2901.pdf) ## Object part detection * Discovering the physical parts of an articulated object class from multiple videos, 2016, [paper](http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Del_Pero_Discovering_the_Physical_CVPR_2016_paper.pdf) ## Pedestrian Detection * Joint Deep Learning for Pedestrian Detection, 2013, [paper](http://www.ee.cuhk.edu.hk/~wlouyang/projects/ouyangWiccv13Joint/material/O&W_ICCV2013.pdf) ## Lip Reading * Lip Reading in the Wild, 2016, Oxford, [paper](https://github.com/sbrugman/deep-learning-papers/tree/master/papers/lip-reading-in-the-wild.pdf) ## Super Resolution * RAISR: Rapid and Accurate Image Super Resolution, okt 2016, Google, [arxiv](https://arxiv.org/pdf/1606.01299v3.pdf) * Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, sep 2016, Twitter, [arxiv](https://arxiv.org/pdf/1609.05158v2) ## Image Compression * Full Resolution Image Compression with Recurrent Neural Networks, aug 2016, Google [arxiv](https://arxiv.org/pdf/1608.05148v1.pdf) ## Automated Theorem Proving * DeepMath - Deep Sequence Models for Premise Selection, jun 2016, Google [arxiv](https://arxiv.org/pdf/1606.04442.pdf) ## Reverse Engineering * Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, okt 2016, [arxiv](https://arxiv.org/pdf/1610.05755v1) * Stealing Machine Learning Models via Prediction APIs, aug 2016, [paper](https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_tramer.pdf) ## Language * Rationalizing Neural Predictions [github](https://github.com/taolei87/rcnn), [arxiv](https://arxiv.org/pdf/1606.04155v1) ## Translation * Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, nov 2016, Google, [arxiv](https://arxiv.org/pdf/1611.04558v1.pdf) * Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, okt 2016, Google [arxiv](https://arxiv.org/pdf/1609.08144v2.pdf) ## Age * Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker, dec 2016, [arxiv](https://arxiv.org/pdf/1612.02572v1)
gitextract_a3mjjkf_/ ├── .gitignore ├── README.md └── task-papers.md
Condensed preview — 3 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (51K chars).
[
{
"path": ".gitignore",
"chars": 8,
"preview": ".idea/*\n"
},
{
"path": "README.md",
"chars": 38475,
"preview": "# Deep Learning Papers by task\nPapers about deep learning ordered by task, date. For each paper there is a permanent lin"
},
{
"path": "task-papers.md",
"chars": 12089,
"preview": "# Deep Learning Papers by task\nPapers about deep learning ordered by task, date. Current state-of-the-art papers and pap"
}
]
About this extraction
This page contains the full source code of the sbrugman/deep-learning-papers GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 3 files (49.4 KB), approximately 15.7k tokens. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.
Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.