Goal

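Challenges

  1. Occlusion of the target causes some of its features to be lost
  2. Different views and illumination make the features of the same target differ
  3. Different targets with similar clothing colors and similar features are harder to tell apart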

Approaches

1. Representation learning + ReID

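ReID is treated either as a classification (identification) problem or as a verification problem:
(1) Classification trains the model using the pedestrian's ID, attributes, or similar information as training labels;
(2) Verification takes a pair of (two) pedestrian images as input and has the network learn whether the two images belong to the same pedestrian.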

The corresponding losses are the classification/identification loss and the verification loss.
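
To make the two losses concrete, here is a minimal sketch using only the Python standard library. It assumes the identification branch outputs one logit per training ID and the verification branch outputs a single "same person" logit for an image pair. The function names and the plain sum of the two losses are illustrative assumptions, not taken from any of the cited papers.

```python
import math

def softmax_cross_entropy(logits, label):
    """Identification loss: cross-entropy over ID logits for one image."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def verification_loss(same_logit, is_same):
    """Verification loss: binary cross-entropy on a pair-level logit."""
    p_same = 1.0 / (1.0 + math.exp(-same_logit))
    return -math.log(p_same) if is_same else -math.log(1.0 - p_same)

# Hypothetical outputs: three training IDs, and one pair judged "same person".
id_loss = softmax_cross_entropy([2.0, 0.5, -1.0], label=0)
ver_loss = verification_loss(same_logit=1.5, is_same=True)
total = id_loss + ver_loss  # assumed equal weighting, for illustration only
```

In practice the logits would come from a CNN, and the two terms are often weighted rather than simply added.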

A further improvement [2] adds more pedestrian labels, such as gender, hair, and clothing.

2. Metric learning + ReID

A method commonly used in image retrieval: the network learns the similarity between two images.
Common losses include contrastive loss [5], triplet loss, quadruplet loss, triplet hard loss with batch hard mining (TriHard loss), and margin sample mining loss (MSML).
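
As a rough illustration of the triplet family, the sketch below computes a plain triplet loss and a batch-hard variant on small feature vectors. It assumes Euclidean distance and a margin of 0.3; both are illustrative choices and the function names are hypothetical. Batch-hard mining here means that, for each anchor in the batch, the farthest same-ID sample and the closest different-ID sample are used.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Push the positive closer to the anchor than the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def batch_hard_triplet_loss(features, ids, margin=0.3):
    """For each anchor, use its hardest positive and hardest negative in the batch."""
    losses = []
    for i, (anchor, pid) in enumerate(zip(features, ids)):
        pos = [euclidean(anchor, f) for j, (f, q) in enumerate(zip(features, ids))
               if q == pid and j != i]
        neg = [euclidean(anchor, f) for f, q in zip(features, ids) if q != pid]
        if pos and neg:  # skip anchors that lack a positive or a negative
            losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0

# Toy batch: two IDs with two 2-D features each.
feats = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
pids = [0, 0, 1, 1]
loss = batch_hard_triplet_loss(feats, pids, margin=2.0)
```

Quadruplet loss and MSML follow the same pattern with a different choice of which pairs enter the hinge.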

Contrastive loss is essentially the loss used to train a Siamese CNN.
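
A minimal sketch of one common form of contrastive loss, assuming the two branches of the Siamese network have already produced feature vectors. The squared-hinge form and the margin value are assumptions (some formulations add a factor of 1/2), and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(feat_a, feat_b, same, margin=1.0):
    """Pull same-ID pairs together; push different-ID pairs at least `margin` apart."""
    d = euclidean(feat_a, feat_b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Same-ID pair: the loss grows with distance.
pos_loss = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True)
# Different-ID pair already farther apart than the margin: no penalty.
neg_loss = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False, margin=2.0)
```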

Training uses three positive samples and one negative sample; what is used at test time is unknown.


3. Local Feature + ReID

Paper [3] uses local features instead of global features: the image is cut into parts, which are then fed into an LSTM.
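
The cutting step can be pictured as splitting the image into horizontal stripes and turning each stripe into one element of a sequence. The sketch below only illustrates that idea on a toy 2-D array. The per-stripe mean is a stand-in descriptor, not the feature extraction used in paper [3], and the function names are hypothetical.

```python
def split_into_stripes(image, num_stripes):
    """Split a list of pixel rows into `num_stripes` horizontal stripes, top to bottom."""
    height = len(image)
    if not 0 < num_stripes <= height:
        raise ValueError("num_stripes must be between 1 and the image height")
    bounds = [i * height // num_stripes for i in range(num_stripes + 1)]
    return [image[bounds[i]:bounds[i + 1]] for i in range(num_stripes)]

def stripe_descriptor(stripe):
    """Stand-in local feature: the mean value of the stripe."""
    values = [v for row in stripe for v in row]
    return sum(values) / len(values)

# Toy "image": 6 rows by 2 columns, where each row holds its own row index.
image = [[float(r), float(r)] for r in range(6)]
sequence = [stripe_descriptor(s) for s in split_into_stripes(image, 3)]
# `sequence` is the ordered list of local features a recurrent model would consume.
```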

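However, paper [3] suffers from misalignment, so paper [4] uses pose and skeleton information for pose estimation and then aligns the parts with an affine transformation.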
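
To illustrate what alignment by affine transformation means, the sketch below builds the simplest possible affine map (scale plus translation) that sends a detected part box onto a canonical box, and applies it to points. This is an assumption-laden toy: paper [4] may use a more general affine transform, and the function names and box sizes here are hypothetical.

```python
def box_to_box_affine(src, dst):
    """Return a 2x3 affine matrix mapping box `src` onto box `dst`.

    Boxes are (x0, y0, x1, y1). Only scale and translation are modeled.
    """
    sx = (dst[2] - dst[0]) / (src[2] - src[0])
    sy = (dst[3] - dst[1]) / (src[3] - src[1])
    return [[sx, 0.0, dst[0] - sx * src[0]],
            [0.0, sy, dst[1] - sy * src[1]]]

def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])

# Map a detected torso box onto a canonical 64x128 frame.
m = box_to_box_affine(src=(10, 20, 30, 60), dst=(0, 0, 64, 128))
center = apply_affine(m, (20, 40))  # the box center lands at the frame center
```

After such a mapping, the same body part occupies the same location in every image, which is what makes the part features comparable.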

Paper [5] crops ROIs directly from the joints: 14 human body joints yield 7 ROI regions (head, upper body, lower body, and the four limbs).
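
A sketch of how 14 joints could be turned into 7 ROI boxes: each region is the bounding box of a subset of joints. The joint names and the joint-to-region grouping below are hypothetical, chosen only to match the head / upper body / lower body / four limbs split described above. They are not the exact assignment from paper [5].

```python
JOINT_NAMES = ["head", "neck", "r_shoulder", "l_shoulder", "r_elbow", "l_elbow",
               "r_wrist", "l_wrist", "r_hip", "l_hip", "r_knee", "l_knee",
               "r_ankle", "l_ankle"]

# Hypothetical grouping of the 14 joints into 7 regions (illustrative only).
REGIONS = {
    "head": ["head", "neck", "r_shoulder", "l_shoulder"],
    "upper_body": ["r_shoulder", "l_shoulder", "r_elbow", "l_elbow",
                   "r_wrist", "l_wrist", "r_hip", "l_hip"],
    "lower_body": ["r_hip", "l_hip", "r_knee", "l_knee", "r_ankle", "l_ankle"],
    "left_arm": ["l_shoulder", "l_elbow", "l_wrist"],
    "right_arm": ["r_shoulder", "r_elbow", "r_wrist"],
    "left_leg": ["l_hip", "l_knee", "l_ankle"],
    "right_leg": ["r_hip", "r_knee", "r_ankle"],
}

def bounding_box(points, pad=0.0):
    """Axis-aligned box (x0, y0, x1, y1) around the points, grown by `pad`."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs) - pad, min(ys) - pad, max(xs) + pad, max(ys) + pad)

def region_boxes(joints, pad=0.0):
    """Map each region name to the bounding box of its joints."""
    return {name: bounding_box([joints[j] for j in members], pad)
            for name, members in REGIONS.items()}

# Made-up joint coordinates for a 64x128 pedestrian crop.
coords = [(32, 10), (32, 25), (20, 30), (44, 30), (16, 50), (48, 50), (14, 70),
          (50, 70), (24, 72), (40, 72), (24, 98), (40, 98), (24, 124), (40, 124)]
joints = dict(zip(JOINT_NAMES, coords))
boxes = region_boxes(joints)
```

In a real pipeline the joints would come from a pose estimator, and a small `pad` keeps limbs from being cropped too tightly.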

4. Video Sequence + ReID

I am not familiar with this direction; two figures were pasted here purely for reference.


5. GAN + ReID

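Even the largest ReID datasets currently have only a few thousand IDs and tens of thousands of images, so CNN-based methods overfit easily.
GANs are mainly used for transfer learning and for conditional generation.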

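The first is the ICCV 2017 paper [5], followed by paper [6], an improvement by the same authors. They do help avoid overfitting, but the quality of the generated images is poor.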

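To handle the bias between different datasets, or even between different cameras, paper [7] uses a CycleGAN-based design and applies transfer learning between two different datasets: the image is first segmented into foreground and background and then translated to the target domain.
There are two losses on D (or possibly two Ds, I am not sure; the paper has no architecture diagram): an absolute-error loss on the foreground and an ordinary discriminator loss. The discriminator loss judges which domain a generated image belongs to, while the foreground loss keeps the pedestrian foreground as realistic and unchanged as possible. The mask is obtained with PSPNet.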
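
The foreground term can be sketched as a masked absolute-error loss between the source image and the generated image, where the mask is 1 on pedestrian pixels and 0 on background. This follows the description in these notes; the exact norm and weighting in paper [7] may differ, and the function name is hypothetical.

```python
def foreground_l1_loss(source, generated, mask):
    """Mean absolute difference between source and generated, counted only where mask is 1."""
    total, count = 0.0, 0
    for src_row, gen_row, mask_row in zip(source, generated, mask):
        for s, g, m in zip(src_row, gen_row, mask_row):
            total += m * abs(g - s)
            count += 1
    return total / count

# Toy 2x2 grayscale images; the top row is foreground, the bottom row background.
source = [[1.0, 2.0], [3.0, 4.0]]
generated = [[1.0, 5.0], [0.0, 4.0]]
mask = [[1, 1], [0, 0]]
fg_loss = foreground_l1_loss(source, generated, mask)
```

Background changes (the bottom-left pixel here) are not penalized, which leaves the generator free to restyle the scene while the person stays intact.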

Pose Normalization[8]



Data types

  • Video-based
  • Image-based
  • Long-term activity
  • Individual action

Datasets

Robust Systems Lab

Code

A simple person re-identification codebase that reaches 88% accuracy
https://github.com/layumi/Person_reID_baseline_pytorch

Paper List

- Point to Set Similarity Based Deep Feature Learning for Person Re-Identification
- Fast Person Re-Identification via Cross-Camera Semantic Binary Transformation
- See the Forest for the Trees: Joint Spatial and Temporal Recurrent Neural Networks for Video-Based Person Re-Identification
- Learning Deep Context-Aware Features Over Body and Latent Parts for Person Re-Identification
- Consistent-Aware Deep Learning for Person Re-Identification in a Camera Network
- Re-Ranking Person Re-Identification With k-Reciprocal Encoding
- Multiple People Tracking by Lifted Multicut and Person Re-Identification

[1] Mengyue Geng, Yaowei Wang, Tao Xiang, Yonghong Tian. Deep transfer learning for person re-identification[J]. arXiv preprint arXiv:1611.05244, 2016.

[2] Yutian Lin, Liang Zheng, Zhedong Zheng, Yu Wu, Yi Yang. Improving person re-identification by attribute and identity learning[J]. arXiv preprint arXiv:1703.07220, 2017.

[3] Rahul Rama Varior, Bing Shuai, Jiwen Lu, Dong Xu, Gang Wang. A siamese long short-term memory architecture for human re-identification[C]//European Conference on Computer Vision. Springer, 2016:135–153.

[4] Liang Zheng, Yujia Huang, Huchuan Lu, Yi Yang. Pose invariant embedding for deep person re-identification[J]. arXiv preprint arXiv:1701.07732, 2017.

[5] Haiyu Zhao, Maoqing Tian, Shuyang Sun, Jing Shao, Junjie Yan, Shuai Yi, Xiaogang Wang, Xiaoou Tang. Spindle net: Person re-identification with human body region guided feature decomposition and fusion[C]. CVPR, 2017.

[6] Zhong Z, Zheng L, Zheng Z, et al. Camera Style Adaptation for Person Re-identification[J]. arXiv preprint arXiv:1711.10295, 2017.

[7] Wei L, Zhang S, Gao W, et al. Person Transfer GAN to Bridge Domain Gap for Person Re-Identification[J]. arXiv preprint arXiv:1711.08565, 2017.

[8] Qian X, Fu Y, Wang W, et al. Pose-Normalized Image Generation for Person Re-identification[J]. arXiv preprint arXiv:1712.02225, 2017.