Image Classification
Must Read : LeNet, AlexNet, VGG-16, GoogLeNet, ResNet
| Title | Authors | Pub. | Links | Figure |
|---|---|---|---|---|
| LeNet-5, convolutional neural networks | Y. LeCun | ??? 199X | Web |
LeNet
|
| ImageNet Classification with Deep Convolutional Neural Networks | Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton | NIPS 2012 | paper |
AlexNet
|
| Very Deep Convolutional Networks for Large-Scale Image Recognition | Karen Simonyan, Andrew Zisserman | ICLR 2015 | paper |
VGG16
|
| Going Deeper with Convolutions | Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich | CVPR 2015 | paper |
GoogLeNet
|
| Deep Residual Learning for Image Recognition | Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun | CVPR 2016 (best paper) | paper github |
ResNet
|
| Residual Attention Network for Image Classification | Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Cheng Li, Honggang Zhang, Xiaogang Wang, Xiaoou Tang | CVPR 2017 | paper github |
Res-Attention-Network
|
| Aggregated Residual Transformations for Deep Neural Networks | Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He | CVPR 2017 | paper github |
ResNeXt
|
| Densely Connected Convolutional Networks | Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger | CVPR 2017 (best paper) | paper github |
DenseNet
|
| Deep Pyramidal Residual Networks | Dongyoon Han, Jiwhan Kim, Junmo Kim | CVPR 2017 | paper github |
PyramidNet
|
Object Detection
Must Read : R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD
| Title | Authors | Pub. | Links | Figure |
|---|---|---|---|---|
| Rich feature hierarchies for accurate object detection and semantic segmentation | Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik | CVPR 2014 | paper github |
R-CNN
|
| Fast R-CNN | Ross Girshick | ICCV 2015 | paper github |
Fast-R-CNN
|
| Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | TPAMI 2015 | paper |
SPP Net
|
| Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun | NIPS 2015 | paper matlab python pytorch |
Faster-R-CNN
|
| You Only Look Once: Unified, Real-Time Object Detection | Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi | CVPR 2016 | paper |
YOLO
|
| SSD: Single Shot MultiBox Detector | Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg | ECCV 2016 | paper |
SSD
|
| Convolutional Feature Masking for Joint Object and Stuff Segmentation | Jifeng Dai, Kaiming He, Jian Sun | CVPR 2015 | paper |
CFM
|
| Instance-aware Semantic Segmentation via Multi-task Network Cascades | Jifeng Dai, Kaiming He, Jian Sun | CVPR 2016 | paper github |
MNC
|
| R-FCN: Object Detection via Region-based Fully Convolutional Networks | Jifeng Dai, Yi Li, Kaiming He, Jian Sun | NIPS 2016 | paper github |
Region-FCN
|
| Feature Pyramid Networks for Object Detection | Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie | CVPR 2017 | paper |
FPN
|
| Mask R-CNN | Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick | ICCV 2017 | paper |
Mask-R-CNN
|
| A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection | Xiaolong Wang, Abhinav Shrivastava, Abhinav Gupta | CVPR 2017 | paper github |
A-Fast-R-CNN
|
| Multiple Instance Detection Network with Online Instance Classifier Refinement | Peng Tang, Xinggang Wang, Xiang Bai, Wenyu Liu | CVPR 2017 | paper |
MIDN
|
| R-FCN-3000 at 30fps: Decoupling Detection and Classification | Bharat Singh, Hengduo Li, Abhishek Sharma, and Larry S. Davis | Tech Report | paper |
R-FCN-3000
|
Semantic Segmentation and Scene Parsing
Must Read : FCN, Learning Deconvolution Network for Semantic Segmentation, U-Net
| Title | Authors | Pub. | Links | Figure |
|---|---|---|---|---|
| Fully Convolutional Networks for Semantic Segmentation | Jonathan Long, Evan Shelhamer, Trevor Darrell | CVPR 2015 | paper |
FCN
|
| Learning to Segment Object Candidates | Pedro O. Pinheiro, Ronan Collobert, Piotr Dollár | NIPS 2015 | paper |
LSOC
|
| Learning to Refine Object Segments | Pedro O. Pinheiro, Tsung-Yi Lin, Ronan Collobert, Piotr Dollár | arXiv 1603.08695 | paper |
LROS
|
| Conditional Random Fields as Recurrent Neural Networks | Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, and Philip H. S. Torr | ICCV 2015 | paper |
CRFRNN
|
| Learning Deconvolution Network for Semantic Segmentation | Hyeonwoo Noh, Seunghoon Hong, Bohyung Han | ICCV 2015 | paper |
LDN
|
| U-Net: Convolutional Networks for Biomedical Image Segmentation | Olaf Ronneberger, Philipp Fischer, Thomas Brox | MICCAI 2015 | paper |
U-Net
|
| Instance-sensitive Fully Convolutional Networks | Jifeng Dai, Kaiming He, Yi Li, Shaoqing Ren, Jian Sun | ECCV 2016 | paper |
ISFCN
|
| Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation | Golnaz Ghiasi, Charless C. Fowlkes | ECCV 2016 | paper github |
LPRR
|
| Attention to Scale: Scale-aware Semantic Image Segmentation | Liang-Chieh Chen, Yi Yang, Jiang Wang, Wei Xu, Alan L. Yuille | CVPR 2016 | paper DeepLab |
Attention-to-scale
|
| RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation | Guosheng Lin, Anton Milan, Chunhua Shen, Ian Reid | CVPR 2017 | paper github |
RefineNet
|
| Pyramid Scene Parsing Network | Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia | CVPR 2017 | paper github |
PSPNet
|
| ICNet for Real-Time Semantic Segmentation on High-Resolution Images | Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia | Tech Report | paper |
ICNet
|
| Dilated Residual Networks | Fisher Yu, Vladlen Koltun, Thomas Funkhouser | CVPR 2017 | paper github |
DRN
|
| Fully Convolutional Instance-aware Semantic Segmentation | Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji, Yichen Wei | CVPR 2017 | paper github |
FCIS
|
| Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes | Tobias Pohlen, Alexander Hermans, Markus Mathias, Bastian Leibe | CVPR 2017 | paper github |
FRRN
|
| Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach | Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan | CVPR 2017 | paper |
A-Erasing
|
| Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade | Xiaoxiao Li, Ziwei Liu, Ping Luo, Chen Change Loy, Xiaoou Tang | CVPR 2017 | paper |
Not-All-Pixels-Are-Equal
|
| Semantic Segmentation with Reverse Attention | Qin Huang, Chunyang Xia, Chihao Wu, Siyang Li, Ye Wang, Yuhang Song, and C.-C. Jay Kuo | BMVC 2017 | paper code |
Rev-Attention
|
| Predicting Deeper into the Future of Semantic Segmentation | Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek and Yann LeCun | ICCV 2017 | paper project page |
Deeper-into-Future
|
| Learning to Segment Every Thing | Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, Ross Girshick | Tech Report | paper |
Seg-Everything
|
Regularization
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
RNN
- Generating Sequences With Recurrent Neural Networks
Word embedding
- Distributed Representations of Words and Phrases and their Compositionality
Image captioning
- Show and Tell: A Neural Image Caption Generator
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention