Deep Learning and Computer Vision Recommended Papers

Image Classification

Must Read : LeNet, AlexNet, VGG-16, GoogLeNet, ResNet

Title	Authors	Pub.	Links	Figure
LeNet-5, convolutional neural networks	Y. LeCun	??? 199X	Web	LeNet
ImageNet Classification with Deep Convolutional Neural Networks	Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton	NIPS 2012	paper	AlexNet
Very Deep Convolutional Networks for Large-Scale Image Recognition	Karen Simonyan, Andrew Zisserman	ICLR 2015	paper	VGG16
Going Deeper with Convolutions	Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich	CVPR 2015	paper	GoogLeNet
Deep Residual Learning for Image Recognition	Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun	CVPR 2016 `best`	paper github	ResNet
Residual Attention Network for Image Classification	Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Cheng Li, Honggang Zhang, Xiaogang Wang, Xiaoou Tang	CVPR 2017	paper github	Res-Attention-Network
Aggregated Residual Transformations for Deep Neural Networks	Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He	CVPR 2017	paper github	ResNeXt
Densely Connected Convolutional Networks	Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger	CVPR 2017 `best`	paper github	DenseNet
Deep Pyramidal Residual Networks	Dongyoon Han, Jiwhan Kim, Junmo Kim	CVPR 2017	paper github	PyramidNet

Object Detection

Must Read : R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD

Title	Authors	Pub.	Links	Figure
Rich feature hierarchies for accurate object detection and semantic segmentation	Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik	CVPR 2014	paper github	R-CNN
Fast R-CNN	Ross Girshick	ICCV 2015	paper github	Fast-R-CNN
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition	Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun	TPAMI 2015	paper	SPP Net
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks	Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun	NIPS 2015	paper `matlab` `python` `pytorch`	Faster-R-CNN
You Only Look Once: Unified, Real-Time Object Detection	Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi	CVPR 2016	paper	YOLO
SSD: Single Shot MultiBox Detector	Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg	ECCV 2016	paper	SSD
Convolutional Feature Masking for Joint Object and Stuff Segmentation	Jifeng Dai, Kaiming He, Jian Sun	CVPR 2015	paper	CFM
Instance-aware Semantic Segmentation via Multi-task Network Cascades	Jifeng Dai, Kaiming He, Jian Sun	CVPR 2016	paper github	MNC
R-FCN: Object Detection via Region-based Fully Convolutional Networks	Jifeng Dai, Yi Li, Kaiming He, Jian Sun	NIPS 2016	paper github	Region-FCN
Feature Pyramid Networks for Object Detection	Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie	CVPR 2017	paper	FPN
Mask R-CNN	Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick	ICCV 2017	paper	Mask-R-CNN
A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection	Xiaolong Wang, Abhinav Shrivastava, Abhinav Gupta	CVPR 2017	paper github	A-Fast-R-CNN
Multiple Instance Detection Network with Online Instance Classifier Refinement	Peng Tang, Xinggang Wang, Xiang Bai, Wenyu Liu	CVPR 2017	paper	MIDN
R-FCN-3000 at 30fps: Decoupling Detection and Classification	Bharat Singh, Hengduo Li, Abhishek Sharma, Larry S. Davis	Tech Report	paper	R-FCN-3000

Semantic Segmentation and Scene Parsing

Must Read : FCN, Learning Deconvolution Network for Semantic Segmentation, U-Net

Title	Authors	Pub.	Links	Figure
Fully Convolutional Networks for Semantic Segmentation	Jonathan Long, Evan Shelhamer, Trevor Darrell	CVPR 2015	paper	FCN
Learning to Segment Object Candidates	Pedro O. Pinheiro, Ronan Collobert, Piotr Dollár	NIPS 2015	paper	LSOC
Learning to Refine Object Segments	Pedro O. Pinheiro, Tsung-Yi Lin, Ronan Collobert, Piotr Dollár	arXiv 1603.08695	paper	LROS
Conditional Random Fields as Recurrent Neural Networks	Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, Philip H. S. Torr	ICCV 2015	paper	CRFRNN
Learning Deconvolution Network for Semantic Segmentation	Hyeonwoo Noh, Seunghoon Hong, Bohyung Han	ICCV 2015	paper	LDN
U-Net: Convolutional Networks for Biomedical Image Segmentation	Olaf Ronneberger, Philipp Fischer, Thomas Brox	MICCAI 2015	paper	U-Net
Instance-sensitive Fully Convolutional Networks	Jifeng Dai, Kaiming He, Yi Li, Shaoqing Ren, Jian Sun	ECCV 2016	paper	ISFCN
Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation	Golnaz Ghiasi, Charless C. Fowlkes	ECCV 2016	paper github	LPRR
Attention to Scale: Scale-aware Semantic Image Segmentation	Liang-Chieh Chen, Yi Yang, Jiang Wang, Wei Xu, Alan L. Yuille	CVPR 2016	paper `DeepLab`	Attention-to-scale
RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation	Guosheng Lin, Anton Milan, Chunhua Shen, Ian Reid	CVPR 2017	paper github	RefineNet
Pyramid Scene Parsing Network	Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia	CVPR 2017	paper github	PSPNet
ICNet for Real-Time Semantic Segmentation on High-Resolution Images	Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia	Tech Report	paper	ICNet
Dilated Residual Networks	Fisher Yu, Vladlen Koltun, Thomas Funkhouser	CVPR 2017	paper github	DRN
Fully Convolutional Instance-aware Semantic Segmentation	Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji, Yichen Wei	CVPR 2017	paper github	FCIS
Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes	Tobias Pohlen, Alexander Hermans, Markus Mathias, Bastian Leibe	CVPR 2017	paper github	FRRN
Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach	Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan	CVPR 2017	paper	A-Erasing
Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade	Xiaoxiao Li, Ziwei Liu, Ping Luo, Chen Change Loy, Xiaoou Tang	CVPR 2017	paper	Not-All-Pixels-Are-Equal
Semantic Segmentation with Reverse Attention	Qin Huang, Chunyang Xia, Chihao Wu, Siyang Li, Ye Wang, Yuhang Song, C.-C. Jay Kuo	BMVC 2017	paper `code`	Rev-Attention
Predicting Deeper into the Future of Semantic Segmentation	Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek and Yann LeCun	ICCV 2017	paper `project page`	Deeper-into-Future
Learning to Segment Every Thing	Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, Ross Girshick	Tech Report	paper	Seg-Everything

Regularization

Dropout: A Simple Way to Prevent Neural Networks from Overfitting
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

RNN

Generating Sequences With Recurrent Neural Networks

Word Embedding

Distributed Representations of Words and Phrases and their Compositionality

Image Captioning

Show and Tell: A Neural Image Caption Generator
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention