Repository: sbrugman/deep-learning-papers
Branch: master
Commit: 358e237239de
Files: 3
Total size: 49.4 KB

Directory structure:
gitextract_a3mjjkf_/

├── .gitignore
├── README.md
└── task-papers.md

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
.idea/*


================================================
FILE: README.md
================================================
# Deep Learning Papers by task
Papers about deep learning, ordered by task and then by date. Each paper has a permanent link, either to arXiv.org or to a copy of the original paper in this repository.
# Table of Contents
1. [Code](#code)

	1.1. [Code Generation](#code-generation)

	1.2. [Malware Detection and Security](#malware-detection-and-security)

2. [Text](#text)

	2.1. [Summarization](#summarization)

	2.2. [Taskbots](#taskbots)

	2.3. [Classification](#classification)

	2.4. [Question Answering](#question-answering)

	2.5. [Sentiment Analysis](#sentiment-analysis)

	2.6. [Translation](#translation)

	2.7. [Chatbots](#chatbots)

	2.8. [Reasoning](#reasoning)
	
	2.9. [Language Representation](#language-representation)

3. [Visual](#visual)

	3.1. [Gaming](#gaming)

	3.2. [Style Transfer](#style-transfer)

	3.3. [Object Tracking](#object-tracking)

	3.4. [Visual Question Answering](#visual-question-answering)

	3.5. [Image Segmentation](#image-segmentation)

	3.6. [Text (in the Wild) Recognition](#text-in-the-wild-recognition)

	3.7. [Brain Computer Interfacing](#brain-computer-interfacing)

	3.8. [Self-Driving Cars](#self-driving-cars)

	3.9. [Object Recognition](#object-recognition)

	3.10. [Logo Recognition](#logo-recognition)

	3.11. [Super Resolution](#super-resolution)

	3.12. [Pose Estimation](#pose-estimation)

	3.13. [Image Captioning](#image-captioning)

	3.14. [Image Compression](#image-compression)

	3.15. [Image Synthesis](#image-synthesis)

	3.16. [Face Recognition](#face-recognition)
	
	3.17. [Image Composition](#image-composition)
	
	3.18. [Scene Graph Parsing](#scene-graph-parsing)
	
	3.19. [Video Deblurring](#video-deblurring)
	
	3.20. [Depth Perception](#depth-perception)

	3.21. [3D Reconstruction](#3d-reconstruction)
	
	3.22. [Vision Representation](#vision-representation)


4. [Audio](#audio)

	4.1. [Audio Synthesis](#audio-synthesis)

5. [Other](#other)

	5.1. [Unclassified](#unclassified)

	5.2. [Regularization](#regularization)

	5.3. [Neural Network Compression](#neural-network-compression)

	5.4. [Optimizers](#optimizers)

## Code
### Code Generation

|Title|Date|Paper|Code|
|---|---|---|---|
| DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07734) |  | 
| A Syntactic Neural Model for General-Purpose Code Generation | _6 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.01696) |  | 
| RobustFill: Neural Program Learning under Noisy I/O | _21 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.07469) |  | 
| DeepFix: Fixing Common C Language Errors by Deep Learning | _12 feb 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deepfix-fixing-common-c-language-errors-by-deep-learning.pdf) |  | 
| DeepCoder: Learning to Write Programs | _7 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01989) |  | 
| Neuro-Symbolic Program Synthesis | _6 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01855) |  | 
| Deep API Learning | _27 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.08535) |  | 
### Malware Detection and Security

|Title|Date|Paper|Code|
|---|---|---|---|
| PassGAN: A Deep Learning Approach for Password Guessing | _1 sep 2017_ | [arxiv](https://arxiv.org/pdf/1709.00440) |  | 
| Deep Android Malware Detection | _22 mar 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-android-malware-detection.pdf) | [github](https://github.com/niallmcl/Deep-Android-Malware-Detection) | 
| Droid-Sec: Deep Learning in Android Malware Detection | _17 aug 2014_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/droid-sec-deep-learning-in-android-malware-detection.pdf) | [github](https://github.com/pjlantz/droidbox) | 
## Text
### Summarization

|Title|Date|Paper|Code|
|---|---|---|---|
| A Deep Reinforced Model for Abstractive Summarization | _11 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.04304) |  | 
| Get To The Point: Summarization with Pointer-Generator Networks | _14 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.04368) |  | 
| SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents | _14 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.04230) |  | 
### Taskbots

|Title|Date|Paper|Code|
|---|---|---|---|
| Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning | _10 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03084) | [github](https://github.com/MiuLab/TC-Bot) | 
| End-to-End Task-Completion Neural Dialogue Systems | _3 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.01008) | [github](https://github.com/MiuLab/TC-Bot) | 
### Classification

|Title|Date|Paper|Code|
|---|---|---|---|
| A Large Self-Annotated Corpus for Sarcasm | _19 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05579) |  | 
| ConceptNet at SemEval-2017 Task 2: Extending Word Embeddings with Multilingual Relational Knowledge | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03560) |  | 
| Bilateral Multi-Perspective Matching for Natural Language Sentences | _13 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.03814) |  | 
| FastText.zip: Compressing text classification models | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03651) |  | 
| ConceptNet 5.5: An Open Multilingual Graph of General Knowledge | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03975) |  | 
| A Simple but Tough-to-Beat Baseline for Sentence Embeddings | _4 nov 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/a-simple-but-tough-to-beat-baseline-for-sentence-embeddings.pdf) | [github](https://github.com/YingyuLiang/SIF) | 
| Enriching Word Vectors with Subword Information | _15 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.04606) |  | 
| From Word Embeddings To Document Distances | _6 jul 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/from-word-embeddings-to-document-distances.pdf) | [github](https://github.com/mkusner/wmd) | 
| Bag of Tricks for Efficient Text Classification | _6 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.01759) |  | 
| Character-level Convolutional Networks for Text Classification | _4 sep 2015_ | [arxiv](https://arxiv.org/pdf/1509.01626) |  | 
| GloVe: Global Vectors for Word Representation | _25 may 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/glove-global-vectors-for-word-representation.pdf) | [github](https://github.com/stanfordnlp/GloVe) | 
| Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks | _28 feb 2015_ | [arxiv](https://arxiv.org/pdf/1503.00075) |  | 
| Distributed Representations of Sentences and Documents | _16 may 2014_ | [arxiv](https://arxiv.org/pdf/1405.4053) |  | 
| Efficient Estimation of Word Representations in Vector Space | _16 jan 2013_ | [arxiv](https://arxiv.org/pdf/1301.3781) |  | 
| SimHash: Hash-based Similarity Detection | _13 dec 2007_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/simhash-hash-based-similarity-detection.pdf) |  | 
### Question Answering

|Title|Date|Paper|Code|
|---|---|---|---|
| IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models | _30 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.10513) | [github](https://github.com/geek-ai/irgan) | 
### Sentiment Analysis

|Title|Date|Paper|Code|
|---|---|---|---|
| Rationalizing Neural Predictions | _13 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.04155) |  | 
| Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank | _18 oct 2013_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/recursive-deep-models-for-semantic-compositionality-over-a-sentiment-treebank.pdf) |  | 
### Translation

|Title|Date|Paper|Code|
|---|---|---|---|
| Attention Is All You Need | _12 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.03762) |  | 
| Convolutional Sequence to Sequence Learning | _8 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.03122) | [github](https://github.com/facebookresearch/fairseq) | 
| Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation | _14 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.04558) |  | 
| A Convolutional Encoder Model for Neural Machine Translation | _7 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.02344) |  | 
| Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | _26 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.08144) |  | 
| Neural Machine Translation by Jointly Learning to Align and Translate | _1 sep 2014_ | [arxiv](https://arxiv.org/pdf/1409.0473) |  | 
### Chatbots

|Title|Date|Paper|Code|
|---|---|---|---|
| A Deep Reinforcement Learning Chatbot | _7 sep 2017_ | [arxiv](https://arxiv.org/pdf/1709.02349) |  | 
| A Neural Conversational Model | _19 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.05869) | [github](https://github.com/inikdom/neural-chatbot) | 
### Reasoning

|Title|Date|Paper|Code|
|---|---|---|---|
| NeuroSAT: Learning a SAT Solver from Single-Bit Supervision | _5 jan 2019_ | [arxiv](https://arxiv.org/pdf/1802.03685.pdf) | [github](https://github.com/dselsam/neurosat) |
| Tracking the World State with Recurrent Entity Networks | _12 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03969) |  |

### Language Representation

|Title|Date|Paper|Code|
|---|---|---|---|
| Efficient Estimation of Word Representations in Vector Space | _7 sep 2013_ | [arxiv](https://arxiv.org/pdf/1301.3781.pdf) |  |
| Distributed Representations of Words and Phrases and their Compositionality | _16 oct 2013_ | [arxiv](https://arxiv.org/pdf/1310.4546.pdf) |  |
| ELMo: Deep contextualized word representations | _22 mar 2018_ | [arxiv](https://arxiv.org/pdf/1802.05365.pdf) |  |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | _24 may 2019_ | [arxiv](https://arxiv.org/pdf/1810.04805.pdf) | [github](https://github.com/google-research/bert) |
| XLNet: Generalized Autoregressive Pretraining for Language Understanding | _19 jun 2019_ | [arxiv](https://arxiv.org/pdf/1906.08237.pdf) | [github](https://github.com/zihangdai/xlnet) |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | _26 jul 2019_ | [arxiv](https://arxiv.org/pdf/1907.11692.pdf) | [github](https://github.com/pytorch/fairseq/tree/master/examples/roberta) |

## Visual
### Gaming

|Title|Date|Paper|Code|
|---|---|---|---|
| Phase-Functioned Neural Networks for Character Control | _1 may 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/phase-functioned-neural-networks-for-character-control.pdf) |  | 
| Equivalence Between Policy Gradients and Soft Q-Learning | _21 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.06440) |  | 
| Beating Atari with Natural Language Guided Reinforcement Learning | _18 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05539) |  | 
| Learning from Demonstrations for Real World Reinforcement Learning | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03732) |  | 
| FeUdal Networks for Hierarchical Reinforcement Learning | _3 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.01161) |  | 
| Overcoming catastrophic forgetting in neural networks | _2 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00796) |  | 
| Playing Doom with SLAM-Augmented Deep Reinforcement Learning | _1 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00380) |  | 
| Playing FPS Games with Deep Reinforcement Learning | _18 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05521) |  | 
| DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess | _16 aug 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deepchess-end-to-end-deep-neural-network-for-automatic-learning-in-chess.pdf) |  | 
| Generative Adversarial Imitation Learning | _10 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.03476) |  | 
| Dueling Network Architectures for Deep Reinforcement Learning | _20 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.06581) |  | 
| Prioritized Experience Replay | _18 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.05952) |  | 
| Human-level control through deep reinforcement learning | _26 feb 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/human-level-control-through-deep-reinforcement-learning.pdf) |  | 
| Playing Atari with Deep Reinforcement Learning | _19 dec 2013_ | [arxiv](https://arxiv.org/pdf/1312.5602) |  | 
### Style Transfer

|Title|Date|Paper|Code|
|---|---|---|---|
| The Contextual Loss for Image Transformation with Non-Aligned Data | _18 jul 2018_ | [arxiv](https://arxiv.org/pdf/1803.02077.pdf) | [github](https://github.com/roimehrez/contextualLoss) |
| Deep Photo Style Transfer | _22 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.07511) |  |
| Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization | _20 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06868) | [github](https://github.com/xunhuang1995/AdaIN-style) | 
| A Learned Representation For Artistic Style | _24 oct 2016_ | [arxiv](https://arxiv.org/pdf/1610.07629) |  | 
| Instance Normalization: The Missing Ingredient for Fast Stylization | _27 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.08022) |  | 
| Perceptual Losses for Real-Time Style Transfer and Super-Resolution | _27 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.08155) | [github](http://github.com/jcjohnson/fast-neural-style) | 
| A Neural Algorithm of Artistic Style | _26 aug 2015_ | [arxiv](https://arxiv.org/pdf/1508.06576) | [github](https://github.com/lengstrom/fast-style-transfer/) |  
### Object Tracking

|Title|Date|Paper|Code|
|---|---|---|---|
| End-to-end representation learning for Correlation Filter based tracking | _20 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.06036) | [github](https://github.com/bertinetto/cfnet) | 
### Visual Question Answering

|Title|Date|Paper|Code|
|---|---|---|---|
| VQA: Visual Question Answering | _3 may 2015_ | [arxiv](https://arxiv.org/pdf/1505.00468) |  | 
### Image Segmentation

|Title|Date|Paper|Code|
|---|---|---|---|
| PointRend: Image Segmentation as Rendering | _17 dec 2019_ | [arxiv](https://arxiv.org/pdf/1912.08193.pdf) | |
| Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation | _22 aug 2018_ | [paper](https://arxiv.org/pdf/1802.02611.pdf), [related: Auto-DeepLab (CVPR 2019)](http://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Auto-DeepLab_Hierarchical_Neural_Architecture_Search_for_Semantic_Image_Segmentation_CVPR_2019_paper.pdf) | [github](https://github.com/tensorflow/models/tree/master/research/deeplab) |
| Dilated Residual Networks | _22 jul 2017_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/dilated-residual-networks.pdf) |  | 
| SfM-Net: Learning of Structure and Motion from Video | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07804) |  | 
| Semi and Weakly Supervised Semantic Segmentation Using Generative Adversarial Network | _28 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.09695) |  | 
| Mask R-CNN | _20 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06870) |  | 
| Learning Features by Watching Objects Move | _19 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.06370) |  | 
| RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation | _20 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.06612) | [github](https://github.com/guosheng/refinenet) | 
| UberNet: Training a 'Universal' Convolutional Neural Network for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory | _7 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.02132) |  | 
| DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs | _2 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.00915) |  | 
| Fully Convolutional Networks for Semantic Segmentation | _20 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.06211) | [github](https://github.com/shelhamer/fcn.berkeleyvision.org) | 
| Instance-aware Semantic Segmentation via Multi-task Network Cascades | _14 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.04412) |  | 
| Multi-Scale Context Aggregation by Dilated Convolutions | _23 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.07122) |  | 
| SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation | _2 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.00561) |  | 
| U-Net: Convolutional Networks for Biomedical Image Segmentation | _18 may 2015_ | [arxiv](https://arxiv.org/pdf/1505.04597) |  | 
| Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs | _22 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.7062) |  | 
| Learning Rich Features from RGB-D Images for Object Detection and Segmentation | _22 jul 2014_ | [arxiv](https://arxiv.org/pdf/1407.5736) |  | 

### Text (in the Wild) Recognition

|Title|Date|Paper|Code|
|---|---|---|---|
| OCR Error Correction Using Character Correction and Feature-Based Word Classification | _21 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.06225) |  | 
| Recursive Recurrent Nets with Attention Modeling for OCR in the Wild | _9 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.03101) |  | 
| COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images | _26 jan 2016_ | [arxiv](https://arxiv.org/pdf/1601.07140) |  | 
| Efficient Scene Text Localization and Recognition with Local Character Refinement | _14 apr 2015_ | [arxiv](https://arxiv.org/pdf/1504.03522) |  | 
| Reading Text in the Wild with Convolutional Neural Networks | _4 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.1842) |  | 
| Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition | _9 jun 2014_ | [arxiv](https://arxiv.org/pdf/1406.2227) |  | 
### Brain Computer Interfacing

|Title|Date|Paper|Code|
|---|---|---|---|
| Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG | _15 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.05051) |  | 
| Encoding Voxels with Deep Learning | _2 dec 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/encoding-voxels-with-deep-learning.pdf) |  | 
| Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream | _8 jul 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-neural-networks-reveal-a-gradient-in-the-complexity-of-neural-representations-across-the-ventral-stream.pdf) |  | 
### Self-Driving Cars

|Title|Date|Paper|Code|
|---|---|---|---|
| Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art | _18 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05519) |  | 
| End to End Learning for Self-Driving Cars | _25 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.07316) |  | 
### Object Recognition

|Title|Date|Paper|Code|
|---|---|---|---|
| Cascade R-CNN: High Quality Object Detection and Instance Segmentation | _24 jun 2019_ | [arxiv](https://arxiv.org/pdf/1906.09756.pdf) | [github](https://github.com/zhaoweicai/cascade-rcnn) |
| YOLOv3: An Incremental Improvement | _8 apr 2018_ | [arxiv](https://arxiv.org/pdf/1804.02767.pdf) | [github](https://github.com/pjreddie/darknet), [github reimplementation](https://github.com/ultralytics/yolov3) | 
| Focal Loss for Dense Object Detection | _7 aug 2017_ | [arxiv](https://arxiv.org/pdf/1708.02002) |  | 
| Introspective Classifier Learning: Empower Generatively | _25 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07816) |  | 
| Learning Chained Deep Features and Classifiers for Cascade in Object Detection | _23 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.07054) |  | 
| DSSD : Deconvolutional Single Shot Detector | _23 jan 2017_ | [arxiv](https://arxiv.org/pdf/1701.06659) |  |  
| YOLO9000: Better, Faster, Stronger | _25 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.08242) | [github](https://github.com/pjreddie/darknet) |  
| Feature Pyramid Networks for Object Detection | _9 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.03144) |  | 
| Speed/accuracy trade-offs for modern convolutional object detectors | _30 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.10012) |  | 
| Aggregated Residual Transformations for Deep Neural Networks | _16 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.05431) |  | 
| Hierarchical Object Detection with Deep Reinforcement Learning | _11 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.03718) |  | 
| Xception: Deep Learning with Depthwise Separable Convolutions | _7 oct 2016_ | [arxiv](https://arxiv.org/pdf/1610.02357) |  | 
| Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition | _1 oct 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/learning-to-make-better-mistakes-semantics-aware-visual-food-recognition.pdf) |  | 
| Densely Connected Convolutional Networks | _25 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.06993) |  | 
| Residual Networks of Residual Networks: Multilevel Residual Networks | _9 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.02908) |  | 
| Context Matters: Refining Object Detection in Video with Recurrent Neural Networks | _15 jul 2016_ | [arxiv](https://arxiv.org/pdf/1607.04648) |  | 
| R-FCN: Object Detection via Region-based Fully Convolutional Networks | _20 may 2016_ | [arxiv](https://arxiv.org/pdf/1605.06409) |  | 
| Training Region-based Object Detectors with Online Hard Example Mining | _12 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.03540) |  | 
| T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos | _9 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.02532) |  | 
| Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning | _23 feb 2016_ | [arxiv](https://arxiv.org/pdf/1602.07261) |  | 
| Deep Residual Learning for Image Recognition | _10 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.03385) |  | 
| SSD: Single Shot MultiBox Detector | _8 dec 2015_ | [arxiv](https://arxiv.org/pdf/1512.02325) |  |  
| Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) | _23 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.07289) |  | 
| ParseNet: Looking Wider to See Better | _15 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.04579) |  | 
| You Only Look Once: Unified, Real-Time Object Detection | _8 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02640) |  |  
| Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | _4 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.01497) |  |  
| Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification | _6 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.01852) |  | 
| Deep Image: Scaling up Image Recognition | _13 jan 2015_ | [arxiv](https://arxiv.org/pdf/1501.02876) |  | 
| Rich feature hierarchies for accurate object detection and semantic segmentation | _11 nov 2013_ | [arxiv](https://arxiv.org/pdf/1311.2524) |  | 
| Selective Search for Object Recognition | _11 mar 2013_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/selective-search-for-object-recognition.pdf) |  | 
| ImageNet Classification with Deep Convolutional Neural Networks | _3 dec 2012_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/imagenet-classification-with-deep-convolutional-neural-networks.pdf) |  |  
### Logo Recognition

|Title|Date|Paper|Code|
|---|---|---|---|
| Deep Learning Logo Detection with Data Expansion by Synthesising Context | _29 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.09322) |  | 
| Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks | _20 apr 2016_ | [arxiv](https://arxiv.org/pdf/1604.06083) |  | 
| LOGO-Net: Large-scale Deep Logo Detection and Brand Recognition with Deep Region-based Convolutional Networks | _8 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.02462) |  | 
| DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer | _7 oct 2015_ | [arxiv](https://arxiv.org/pdf/1510.02131) |  | 
### Super Resolution

|Title|Date|Paper|Code|
|---|---|---|---|
| Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network | _16 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05158) |  | 
| Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network | _15 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.04802) |  | 
| RAISR: Rapid and Accurate Image Super Resolution | _3 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.01299) |  | 
| Perceptual Losses for Real-Time Style Transfer and Super-Resolution | _27 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.08155) | [github](http://github.com/jcjohnson/fast-neural-style) | 
| Image Super-Resolution Using Deep Convolutional Networks | _31 dec 2014_ | [arxiv](https://arxiv.org/pdf/1501.00092) |  | 
### Pose Estimation

|Title|Date|Paper|Code|
|---|---|---|---|
| Forecasting Human Dynamics from Static Images | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03432) |  | 
| Fast Single Shot Detection and Pose Estimation | _19 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.05590) |  | 
| Convolutional Pose Machines | _30 jan 2016_ | [arxiv](https://arxiv.org/pdf/1602.00134) |  | 
| Flowing ConvNets for Human Pose Estimation in Videos | _9 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02897) |  | 
### Image Captioning

|Title|Date|Paper|Code|
|---|---|---|---|
| Actor-Critic Sequence Training for Image Captioning | _29 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.09601) |  | 
| Detecting and Recognizing Human-Object Interactions | _24 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.07333) |  | 
| Deep Reinforcement Learning-based Image Captioning with Embedding Reward | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03899) |  | 
| Towards Diverse and Natural Image Descriptions via a Conditional GAN | _17 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.06029) |  | 
| Temporal Tessellation: A Unified Approach for Video Analysis | _21 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.06950) | [github](https://github.com/dot27/temporal-tessellation) | 
| Self-critical Sequence Training for Image Captioning | _2 dec 2016_ | [arxiv](https://arxiv.org/pdf/1612.00563) |  | 
| Generation and Comprehension of Unambiguous Object Descriptions | _7 nov 2015_ | [arxiv](https://arxiv.org/pdf/1511.02283) |  | 
| Show, Attend and Tell: Neural Image Caption Generation with Visual Attention | _10 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.03044) |  | 
| Long-term Recurrent Convolutional Networks for Visual Recognition and Description | _17 nov 2014_ | [arxiv](https://arxiv.org/pdf/1411.4389) |  | 
### Image Compression

|Title|Date|Paper|Code|
|---|---|---|---|
| Full Resolution Image Compression with Recurrent Neural Networks | _18 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.05148) |  | 
### Image Synthesis

|Title|Date|Paper|Code|
|---|---|---|---|
| Scene Text Synthesis for Efficient and Effective Deep Network Training | _26 jan 2019_ | [arxiv](https://arxiv.org/pdf/1901.09193.pdf) | |
| A Neural Representation of Sketch Drawings | _11 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03477) |  | 
| BEGAN: Boundary Equilibrium Generative Adversarial Networks | _31 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10717) | [github](https://github.com/carpedm20/BEGAN-tensorflow) | 
| Improved Training of Wasserstein GANs | _31 mar 2017_ | [arxiv](https://arxiv.org/pdf/1704.00028) |  | 
| Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks | _30 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10593) | [github](https://github.com/junyanz/CycleGAN) | 
| Wasserstein GAN | _26 jan 2017_ | [arxiv](https://arxiv.org/pdf/1701.07875) |  | 
| RenderGAN: Generating Realistic Labeled Data | _4 nov 2016_ | [arxiv](https://arxiv.org/pdf/1611.01331) |  | 
| Conditional Image Generation with PixelCNN Decoders | _16 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.05328) |  | 
| Pixel Recurrent Neural Networks | _25 jan 2016_ | [arxiv](https://arxiv.org/pdf/1601.06759) |  | 
| Generative Adversarial Networks | _10 jun 2014_ | [arxiv](https://arxiv.org/pdf/1406.2661) |  | 
### Face Recognition

|Title|Date|Paper|Code|
|---|---|---|---|
| Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition | _24 oct 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/accessorize-to-a-crime-real-and-stealthy-attacks-on-state-of-the-art-face-recognition.pdf) |  | 
| OpenFace: A general-purpose face recognition library with mobile applications | _1 jun 2016_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/openface-a-general-purpose-face-recognition-library-with-mobile-applications.pdf) |  | 
| Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns | _9 nov 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/emotion-recognition-in-the-wild-via-convolutional-neural-networks-and-mapped-binary-patterns.pdf) |  | 
| Deep Face Recognition | _7 sep 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/deep-face-recognition.pdf) |  | 
| Compact Convolutional Neural Network Cascade for Face Detection | _6 aug 2015_ | [arxiv](https://arxiv.org/pdf/1508.01292) |  | 
| Learning Robust Deep Face Representation | _17 jul 2015_ | [arxiv](https://arxiv.org/pdf/1507.04844) |  | 
| FaceNet: A unified embedding for face recognition and clustering | _12 jun 2015_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/facenet-a-unified-embedding-for-face-recognition-and-clustering.pdf) |  | 
| Multi-view Face Detection Using Deep Convolutional Neural Networks | _10 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.02766) |  | 

### Image Composition

|Title|Date|Paper|Code|
|---|---|---|---|
| Auto-Retoucher(ART) — A Framework for Background Replacement and Foreground Adjustment | _13 jan 2019_ | [arxiv](https://arxiv.org/pdf/1901.03954.pdf) (brave new task) | [github](https://github.com/woshiyyya/Auto-Retoucher-pytorch) (not able to reproduce results based on code) | 
| Spatial Fusion GAN for Image Synthesis | _14 dec 2018_ | [arxiv](https://arxiv.org/pdf/1812.05840.pdf) (needs revision, interesting approach however) | [github](https://github.com/fnzhan/SF-GAN) (currently, no code available) | 
| Compositional GAN: Learning Conditional Image Composition | _23 aug 2018_ | [arxiv](https://arxiv.org/pdf/1807.07560.pdf) (with respect to spatial orientation) | [github](https://github.com/azadis/CompositionalGAN) (currently, no code available) | 
| Deep Painterly Harmonization | _26 jun 2018_ | [paper](https://arxiv.org/pdf/1804.03189.pdf) | [github](https://github.com/luanfujun/deep-painterly-harmonization) | 
| ST-GAN | _5 mar 2018_ | [arxiv](https://arxiv.org/pdf/1803.01837) (with respect to spatial orientation) | [github](https://github.com/chenhsuanlin/spatial-transformer-GAN)  | 
| Deep Image Harmonization | _28 feb 2017_ | [paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Tsai_Deep_Image_Harmonization_CVPR_2017_paper.pdf) | [github](https://github.com/wasidennis/DeepHarmonization) (only code for inference) | 
| Understanding and Improving the Realism of Image Composites | _1 jul 2012_ | [paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.682.3987&rep=rep1&type=pdf) | |


### Scene Graph Parsing

|Title|Date|Paper|Code|
|---|---|---|---|
| Neural Motifs: Scene Graph Parsing with Global Context | _29 Mar 2018_ | [arxiv](https://arxiv.org/pdf/1711.06640.pdf) | [github](https://github.com/rowanz/neural-motifs) | 


### Video Deblurring

|Title|Date|Paper|Code|
|---|---|---|---|
| Spatio-Temporal Filter Adaptive Network for Video Deblurring | _28 Apr 2019_ | [arxiv](https://arxiv.org/pdf/1904.12257.pdf) | [project page](https://shangchenzhou.com/projects/stfan/) (code to appear) | 

### Depth Perception

|Title|Date|Paper|Code|
|---|---|---|---|
| Learning Depth with Convolutional Spatial Propagation Network | _13 Oct 2018_ | [arxiv](https://arxiv.org/pdf/1810.02695.pdf) | [github](https://github.com/XinJCheng/CSPN) |
| Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches | _18 May 2016_ | [arxiv](https://arxiv.org/pdf/1510.05970.pdf) | [github](https://github.com/jzbontar/mc-cnn) | 

### 3D Reconstruction

|Title|Date|Paper|Code|
|---|---|---|---|
| Cerberus: A Multi-headed Derenderer | _28 May 2019_ | [arxiv](https://arxiv.org/pdf/1905.11940.pdf) |  |


### Vision Representation

|Title|Date|Paper|Code|
|---|---|---|---|
| VisualBERT: A Simple and Performant Baseline for Vision and Language | _9 aug 2019_ | [arxiv](https://arxiv.org/pdf/1908.03557.pdf) | |
| Expected to appear: some paper learning an unsupervised vision representation that beats SOTA on a large number of tasks | _before the end of 2019_ | | | 


## Audio
### Audio Synthesis

|Title|Date|Paper|Code|
|---|---|---|---|
| Deep Cross-Modal Audio-Visual Generation | _26 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.08292) |  | 
| A Neural Parametric Singing Synthesizer | _12 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.03809) |  | 
| Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders | _5 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.01279) | [github](https://github.com/tensorflow/magenta/tree/master/magenta/models/nsynth) | 
| Tacotron: Towards End-to-End Speech Synthesis | _29 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.10135) | [github](https://github.com/Kyubyong/tacotron) | 
| Deep Voice: Real-time Neural Text-to-Speech | _25 feb 2017_ | [arxiv](https://arxiv.org/pdf/1702.07825) |  | 
| WaveNet: A Generative Model for Raw Audio | _12 sep 2016_ | [arxiv](https://arxiv.org/pdf/1609.03499) | [github](https://github.com/ibab/tensorflow-wavenet) |  
## Other
### Unclassified

|Title|Date|Paper|Code|
|---|---|---|---|
| A simple neural network module for relational reasoning | _5 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.01427) |  | 
| Deep Complex Networks | _27 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.09792) | [github](https://github.com/ChihebTrabelsi/deep_complex_networks) | 
| Learning to Fly by Crashing | _19 apr 2017_ | [arxiv](https://arxiv.org/pdf/1704.05588) |  | 
| Who Said What: Modeling Individual Labelers Improves Classification | _26 mar 2017_ | [arxiv](https://arxiv.org/pdf/1703.08774) |  | 
| Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data | _18 oct 2016_ | [arxiv](https://arxiv.org/pdf/1610.05755) |  | 
| DeepMath - Deep Sequence Models for Premise Selection | _14 jun 2016_ | [arxiv](https://arxiv.org/pdf/1606.04442) |  | 
| Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue | _16 mar 2016_ | [arxiv](https://arxiv.org/pdf/1603.04992) |  | 
| Long Short-Term Memory | _15 nov 1997_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/long-short-term-memory.pdf) |  | 
### Regularization

|Title|Date|Paper|Code|
|---|---|---|---|
| Self-Normalizing Neural Networks | _8 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.02515) |  | 
| Concrete Dropout | _22 may 2017_ | [arxiv](https://arxiv.org/pdf/1705.07832) | [github](https://github.com/yaringal/ConcreteDropout) | 
| Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning | _6 jun 2015_ | [arxiv](https://arxiv.org/pdf/1506.02142) |  | 
| Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift | _11 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.03167) |  | 
### Neural Network Compression

|Title|Date|Paper|Code|
|---|---|---|---|
| Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial "Bottleneck" Structure | _15 aug 2016_ | [arxiv](https://arxiv.org/pdf/1608.04337) |  | 
| SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and &lt;0.5MB model size | _24 feb 2016_ | [arxiv](https://arxiv.org/pdf/1602.07360) |  | 
| Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding | _1 oct 2015_ | [arxiv](https://arxiv.org/pdf/1510.00149) |  | 
### Optimizers

|Title|Date|Paper|Code|
|---|---|---|---|
| Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour | _8 jun 2017_ | [arxiv](https://arxiv.org/pdf/1706.02677) |  | 
| Equilibrated adaptive learning rates for non-convex optimization | _15 feb 2015_ | [arxiv](https://arxiv.org/pdf/1502.04390) |  | 
| Adam: A Method for Stochastic Optimization | _22 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.6980) |  | 
| Deep learning with Elastic Averaging SGD | _20 dec 2014_ | [arxiv](https://arxiv.org/pdf/1412.6651) |  | 
| ADADELTA: An Adaptive Learning Rate Method | _22 dec 2012_ | [arxiv](https://arxiv.org/pdf/1212.5701) |  | 
| Advances in Optimizing Recurrent Networks | _4 dec 2012_ | [arxiv](https://arxiv.org/pdf/1212.0901) |  | 
| Efficient Backprop | _1 jul 1998_ | [paper](https://github.com/sbrugman/deep-learning-papers/raw/master/papers/efficient-backprop.pdf) |  |  

## A note on arXiv

arXiv provides the world with access to the newest scientific developments.
Open access has many benefits; in particular, it makes science more efficient.
Still, remember to consider the quality of the papers referenced,
and in particular the importance of the [peer-review process](https://undsci.berkeley.edu/article/howscienceworks_16) for science.  
If you find an article on arXiv, check whether it has been peer-reviewed and published elsewhere.
The authoritative version of a paper is not the arXiv version but the published, peer-reviewed one.
The two versions may differ significantly. 
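
One way to start that check is the arXiv API, whose Atom entries carry an optional `journal_ref` and `doi` when the authors have supplied them. The sketch below only parses such a feed. The sample feed is made up for illustration, and `published_versions` is a hypothetical helper name, not part of any library. A missing field does not prove that no peer-reviewed version exists.

```python
import xml.etree.ElementTree as ET

# XML namespaces used in arXiv API Atom feeds.
ATOM = "{http://www.w3.org/2005/Atom}"
ARXIV = "{http://arxiv.org/schemas/atom}"


def published_versions(atom_feed: str) -> dict:
    """Map each entry title to its journal reference and DOI (None if absent)."""
    root = ET.fromstring(atom_feed)
    result = {}
    for entry in root.iter(f"{ATOM}entry"):
        # Collapse the line breaks that can occur inside feed titles.
        title = " ".join(entry.findtext(f"{ATOM}title", default="").split())
        result[title] = {
            "journal_ref": entry.findtext(f"{ARXIV}journal_ref"),
            "doi": entry.findtext(f"{ARXIV}doi"),
        }
    return result


# Made-up sample feed; a real one could be fetched from, for example,
# http://export.arxiv.org/api/query?id_list=1608.07187
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:arxiv="http://arxiv.org/schemas/atom">
  <entry>
    <title>Example Paper With a Published Version</title>
    <arxiv:journal_ref>Example Journal 1 (2017) 1-10</arxiv:journal_ref>
    <arxiv:doi>10.0000/example</arxiv:doi>
  </entry>
  <entry>
    <title>Example Preprint Only</title>
  </entry>
</feed>"""

for title, info in published_versions(SAMPLE_FEED).items():
    print(title, info)
```

Entries that come back without a journal reference still need a manual search, since these fields are filled in by the authors.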

For example, this is the case with one of the papers that I once discussed in the Text and Multimedia Mining class at Radboud:
- [peer-reviewed version](http://opus.bath.ac.uk/55288/4/CaliskanEtAl_authors_full.pdf)
- [arXiv version](https://arxiv.org/abs/1608.07187)

Compare for yourself.

For the selection of the papers above, I chose open access over completeness.
If you find another (open) version of a paper, you are invited to make a pull request.


================================================
FILE: task-papers.md
================================================
# Deep Learning Papers by task
Papers about deep learning ordered by task, date. Current state-of-the-art papers and papers useful for getting started are labelled.

## Object Recognition
* YOLO9000: Better, Faster, Stronger, dec 2016, [arxiv](https://arxiv.org/pdf/1612.08242v1) ![state-of-the-art](https://img.shields.io/badge/style-state_of_the_art-green.svg?style=flat&label=paper)
* Feature Pyramid Networks for Object Detection, dec 2016, [arxiv](https://arxiv.org/pdf/1612.03144.pdf)
* Densely Connected Convolutional Networks, dec 2016, [arxiv](https://arxiv.org/pdf/1608.06993.pdf)
* PVANet: Lightweight Deep Neural Networks for Real-time Object Detection, dec 2016, [arxiv](https://arxiv.org/pdf/1611.08588.pdf)
* Aggregated Residual Transformations for Deep Neural Networks, nov 2016, [arxiv](https://arxiv.org/pdf/1611.05431v1.pdf)
* Hierarchical Object Detection with Deep Reinforcement Learning, nov 2016, [arxiv](https://arxiv.org/pdf/1611.03718v1.pdf)
* Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition, oct 2016, IBM, [paper](https://github.com/sbrugman/deep-learning-papers/blob/master/papers/learning-to-make-better-mistakes-semantics-aware-visual-food-recognition.pdf)
* T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos, aug 2016, [github](https://github.com/myfavouritekk/T-CNN), [arxiv](https://arxiv.org/pdf/1604.02532v3)
* Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, aug 2016, Google, [arxiv](https://arxiv.org/pdf/1602.07261.pdf)
* Residual Networks of Residual Networks: Multilevel Residual Networks, aug 2016, [arxiv](https://arxiv.org/pdf/1608.02908v1)
* Context Matters: Refining Object Detection in Video with Recurrent Neural Networks, jul 2016, [arxiv](https://arxiv.org/pdf/1607.04648.pdf)
* R-FCN: Object Detection via Region-based Fully Convolutional Networks, jun 2016, [arxiv](https://arxiv.org/pdf/1605.06409.pdf)
* Training Region-based Object Detectors with Online Hard Example Mining, apr 2016, Facebook, [arxiv](https://arxiv.org/pdf/1604.03540v1)
* Deep Residual Learning for Image Recognition, dec 2015, [arxiv](https://arxiv.org/pdf/1512.03385v1)
* SSD: Single Shot MultiBox Detector, dec 2015, Google, [github](https://github.com/weiliu89/caffe/tree/ssd), [arxiv](https://arxiv.org/pdf/1512.02325v2)
* ParseNet: Looking Wider to See Better, jun 2015, [arxiv](https://arxiv.org/pdf/1506.04579v2.pdf)
* You Only Look Once: Unified, Real-Time Object Detection, jun 2015, Facebook, [arxiv](https://arxiv.org/pdf/1506.02640v5)
* Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, jun 2015, Microsoft/Facebook, [arxiv](https://arxiv.org/pdf/1506.01497v3.pdf)
* Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, feb 2015, [arxiv](https://arxiv.org/pdf/1502.01852v1)
* Deep Image: Scaling up Image Recognition, jul 2015, [arxiv](https://arxiv.org/vc/arxiv/papers/1501/1501.02876v1.pdf)
* Rich feature hierarchies for accurate object detection and semantic segmentation, 2014, [paper](http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf)
* Selective Search for Object Recognition, 2012, [paper](https://pdfs.semanticscholar.org/6a65/f9abad1022c7df2c75b819f48251aac23ae8.pdf)
* ImageNet Classification with Deep Convolutional Neural Networks, 2012, [paper](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf) ![getting-started](https://img.shields.io/badge/style-getting_started-yellow.svg?style=flat&label=paper)

## Image Segmentation
* Mask R-CNN, mar 2017, [arxiv](https://arxiv.org/pdf/1703.06870)
* Learning Features by Watching Objects Move, dec 2016, Facebook, [arxiv](https://arxiv.org/pdf/1612.06370.pdf) ![state-of-the-art](https://img.shields.io/badge/style-state_of_the_art-green.svg?style=flat&label=paper)
* Fully Convolutional Networks for Semantic Segmentation, may 2016, [arxiv](https://arxiv.org/pdf/1605.06211v1)
* Instance-aware Semantic Segmentation via Multi-task Network Cascades, dec 2015, [arxiv](https://arxiv.org/pdf/1512.04412.pdf)
* SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, dec 2015, [arxiv](https://arxiv.org/pdf/1511.00561v2.pdf)
* Learning Rich Features from RGB-D Images for Object Detection and Segmentation, jul 2014, [arxiv](https://arxiv.org/pdf/1407.5736v1)

## Pose Estimation
* Fast Single Shot Detection and Pose Estimation, sep 2016, [arxiv](https://arxiv.org/pdf/1609.05590.pdf)

## Face Recognition
* Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition, [paper](https://www.cs.cmu.edu/~sbhagava/papers/face-rec-ccs16.pdf)
* OpenFace: A general-purpose face recognition library with mobile applications, June 2016, [paper](http://reports-archive.adm.cs.cmu.edu/anon/anon/usr0/ftp/2016/CMU-CS-16-118.pdf)
* Deep Face Recognition, 2015, [paper](http://www.robots.ox.ac.uk:5000/~vgg/publications/2015/Parkhi15/parkhi15.pdf)
* Compact Convolutional Neural Network Cascade for Face Detection, aug 2015, [arxiv](https://arxiv.org/pdf/1508.01292v3)
* Learning Robust Deep Face Representation, Jul 2015, [arxiv](https://arxiv.org/pdf/1507.04844.pdf)
* FaceNet: A Unified Embedding for Face Recognition and Clustering, jun 2015, [paper](http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Schroff_FaceNet_A_Unified_2015_CVPR_paper.pdf)
* Multi-view Face Detection Using Deep Convolutional Neural Networks, feb 2015, Yahoo, [arxiv](https://arxiv.org/pdf/1502.02766v3.pdf)

## Style Transfer
* Deep Photo Style Transfer, mar 2017, [arxiv](https://arxiv.org/pdf/1703.07511)
* A learned representation for artistic style, oct 2016, Google, [arxiv](https://arxiv.org/pdf/1610.07629v1.pdf), [demo](https://magenta.tensorflow.org/2016/11/01/multistyle-pastiche-generator/)
* Fast Style Transfer in TensorFlow, [github](https://github.com/lengstrom/fast-style-transfer/)
* Instance Normalization: The Missing Ingredient for Fast Stylization, sept 2016, [arxiv](https://arxiv.org/abs/1607.08022)
* A Neural Algorithm of Artistic Style, sept 2015, [arxiv](https://arxiv.org/abs/1508.06576)
* Perceptual Losses for Real-Time Style Transfer and Super-Resolution, [arxiv](https://arxiv.org/pdf/1603.08155v1), [github](http://github.com/jcjohnson/fast-neural-style)

## Logo Recognition
* Deep Learning Logo Detection with Data Expansion by Synthesising Context, dec 2016, [arxiv](https://arxiv.org/pdf/1612.09322.pdf) ![state-of-the-art](https://img.shields.io/badge/style-state_of_the_art-green.svg?style=flat&label=paper) 
* Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks, apr 2016, [arxiv](https://arxiv.org/pdf/1604.06083.pdf)
* Logo Localization and Recognition in Natural Images Using Homographic Class Graphs, 2016, [paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.717.9015&rep=rep1&type=pdf)
* LOGO-Net: Large-scale Deep Logo Detection and Brand Recognition with Deep Region-based Convolutional Networks, nov 2015, [arxiv](https://arxiv.org/pdf/1511.02462.pdf)
* DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer, oct 2015, Berkeley, [arxiv](https://arxiv.org/pdf/1510.02131v1.pdf)
* Automatic detection of logos in video and their removal using inpainting, jul 2015, [paper](https://github.com/sbrugman/deep-learning-papers/tree/master/papers/automatic-detection-of-logos-in-video-and-their-removal-using-inpainting.pdf) 
* On the Benefit of Synthetic Data for Company Logo Detection, 2015, [paper](http://www.multimedia-computing.de/mediawiki/images/c/cf/ACMMM2015.pdf)
* Fast and Robust Realtime Storefront Logo Recognition, [paper](https://stacks.stanford.edu/file/druid:bf950qp8995/Liu.pdf)
* Scalable Logo Recognition in Real-World Images, 2011, [paper](http://www.multimedia-computing.de/mediawiki/images/3/34/ICMR2011_Scalable_Logo_Recognition_in_Real-World_Images.pdf)
* https://arxiv.org/pdf/1609.01414v1.pdf

_note: also includes some papers that use SIFT_

## Text (in the Wild) Recognition
* COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images, jun 2016, [arxiv](https://arxiv.org/pdf/1601.07140v2.pdf)
* Recursive Recurrent Nets with Attention Modeling for OCR in the Wild, mar 2016, [arxiv](https://arxiv.org/pdf/1603.03101v1) ![state-of-the-art](https://img.shields.io/badge/style-state_of_the_art-green.svg?style=flat&label=paper)
* OCR Error Correction Using Character Correction and Feature-Based Word Classification, 2016, [arxiv](https://arxiv.org/pdf/1604.06225.pdf)
* Efficient Scene Text Localization and Recognition with Local Character Refinement, apr 2015, [arxiv](https://arxiv.org/pdf/1504.03522.pdf)
* Reading Text in the Wild with Convolutional Neural Networks, dec 2014, [arxiv](https://arxiv.org/pdf/1412.1842v1.pdf)
* Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition, jun 2014, [arxiv](https://arxiv.org/pdf/1406.2227v4.pdf)
* Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning, 2011, [paper](http://www.cs.stanford.edu/~acoates/papers/coatesetal_icdar_2011.pdf)

## Image / Video Description
* Generation and Comprehension of Unambiguous Object Descriptions, apr 2016, [arxiv](https://arxiv.org/pdf/1511.02283v3.pdf)
* Long-term Recurrent Convolutional Networks for Visual Recognition and Description, may 2016, [arxiv](https://arxiv.org/pdf/1411.4389v4)

## Detect key actor
* Detecting events and key actors in multi-person videos, mar 2016, [arxiv](https://arxiv.org/pdf/1511.02917v2.pdf)

## Self-driving cars
* End to End Learning for Self-Driving Cars, apr 2016, NVIDIA, [arxiv](https://arxiv.org/pdf/1604.07316.pdf)

## ConvNet visualization
* Visualizing and Understanding Convolutional Networks, nov 2013, [arxiv](https://arxiv.org/pdf/1311.2901.pdf)

## Object part detection
* Discovering the physical parts of an articulated object class from multiple videos, 2016, [paper](http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Del_Pero_Discovering_the_Physical_CVPR_2016_paper.pdf)

## Pedestrian Detection
* Joint Deep Learning for Pedestrian Detection, 2013, [paper](http://www.ee.cuhk.edu.hk/~wlouyang/projects/ouyangWiccv13Joint/material/O&W_ICCV2013.pdf)

## Lip Reading
* Lip Reading in the Wild, 2016, Oxford, [paper](https://github.com/sbrugman/deep-learning-papers/tree/master/papers/lip-reading-in-the-wild.pdf)

## Super Resolution
* RAISR: Rapid and Accurate Image Super Resolution, oct 2016, Google, [arxiv](https://arxiv.org/pdf/1606.01299v3.pdf)
* Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, sep 2016, Twitter, [arxiv](https://arxiv.org/pdf/1609.05158v2)

## Image Compression
* Full Resolution Image Compression with Recurrent Neural Networks, aug 2016, Google, [arxiv](https://arxiv.org/pdf/1608.05148v1.pdf)

## Automated Theorem Proving
* DeepMath - Deep Sequence Models for Premise Selection, jun 2016, Google, [arxiv](https://arxiv.org/pdf/1606.04442.pdf)

## Reverse Engineering
* Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, oct 2016, [arxiv](https://arxiv.org/pdf/1610.05755v1)
* Stealing Machine Learning Models via Prediction APIs, aug 2016, [paper](https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_tramer.pdf)

## Language
* Rationalizing Neural Predictions, [github](https://github.com/taolei87/rcnn), [arxiv](https://arxiv.org/pdf/1606.04155v1)

## Translation
* Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, nov 2016, Google, [arxiv](https://arxiv.org/pdf/1611.04558v1.pdf)
* Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, oct 2016, Google, [arxiv](https://arxiv.org/pdf/1609.08144v2.pdf)

## Age
* Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker, dec 2016, [arxiv](https://arxiv.org/pdf/1612.02572v1)