Perspective on Explainable SAR Target Recognition

GUO Weiwei¹  ZHANG Zenghui²  YU Wenxian²  SUN Xiaohua¹

1. Center of Digital Innovation, Tongji University, Shanghai 200092, China
2. Shanghai Key Laboratory of Intelligent Sensing and Recognition, Shanghai Jiao Tong University, Shanghai 200240, China


    About the authors:
    GUO Weiwei (1983–), male, born in Nantong, Jiangsu. He received the B.S. degree in information engineering and the M.S. and Ph.D. degrees in information and communication engineering from the National University of Defense Technology in 2005, 2007, and 2014, respectively. From 2008 to 2010 he was a jointly supervised Ph.D. student at Queen Mary, University of London, UK. From December 2014 to June 2018 he was a postdoctoral researcher with the School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, and since December 2018 he has been an assistant professor with the College of Design and Innovation, Tongji University. His research interests include remote sensing image understanding, pattern recognition and machine learning, and human-computer interaction. E-mail: weiweiguo@tongji.edu.cn;
    ZHANG Zenghui (1980–), male, born in Jinxiang, Shandong. He received the B.S. degree in applied mathematics, the M.S. degree in computational mathematics, and the Ph.D. degree in information and communication engineering from the National University of Defense Technology in 2001, 2003, and 2008, respectively. From June 2008 to July 2013 he was a lecturer with the Department of Mathematics and Systems Science, National University of Defense Technology; since February 2014 he has been an associate research professor with the School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University. His research interests include SAR image interpretation and radar signal processing. E-mail: zenghui.zhang@sjtu.edu.cn;
    YU Wenxian (1964–), male, born in Songjiang, Shanghai. Ph.D., professor, and doctoral supervisor, he is a chair professor at Shanghai Jiao Tong University, a Distinguished Professor of the Changjiang Scholars Program of the Ministry of Education, and a Shanghai Leading Talent. He is currently the dean of the Research Institute of Information Technology and Electrical Engineering, Shanghai Jiao Tong University, director of the Shanghai Key Laboratory of BeiDou Navigation and Location-Based Services, and director of the Shanghai Key Laboratory of Intelligent Sensing and Recognition. His research interests include remote sensing information processing, multi-source fusion navigation and positioning, and target detection and recognition. E-mail: wxyu@sjtu.edu.cn;
    SUN Xiaohua (1972–), female, born in Anyang, Henan. She received the M.S. and Ph.D. degrees in design and computation from the Massachusetts Institute of Technology. She is a professor, doctoral supervisor, and vice dean of the College of Design and Innovation, Tongji University. She previously conducted research and teaching at MIT CECI, the MIT Media Lab, FXPAL, IBM Research, and Clarkson University. Her research interests include human-machine intelligent interaction and integration, Human-Robot Interaction (HRI), and visual analytics. E-mail: xsun@tongji.edu.cn.
    Corresponding author: ZHANG Zenghui, zenghui.zhang@sjtu.edu.cn
  • Funding: The Joint Funds of the National Natural Science Foundation of China (U1830103)
  • CLC number: TN957.51


  • Abstract: Synthetic Aperture Radar (SAR) image target recognition is one of the key technologies for realizing microwave vision. Although deep learning has been successfully applied to SAR image target recognition and has markedly surpassed traditional methods, its opaque internal working mechanism and insufficient explainability have become bottlenecks that restrict the reliable and trustworthy application of SAR target recognition. The explainability of deep learning is currently a research hotspot and open difficulty in artificial intelligence, and is essential for understanding and trusting model decisions. This paper first summarizes the research progress and remaining challenges of SAR image target recognition and reviews the state of research on deep learning explainability. On this basis, the explainability of SAR image target recognition is discussed from the aspects of model understanding, model diagnosis, and model improvement. Finally, taking explainability research as an entry point, possible directions for overcoming the technical bottlenecks of SAR target recognition are further discussed, including the integration of domain knowledge, human-machine collaboration, and interactive learning.
  • Figure 1.  Grad-CAM [6] of a simple CNN classifier on a T72 SAR image

    Figure 2.  Optical image and SAR images of the T62 tank at different azimuth angles in the MSTAR dataset

    Figure 3.  Challenges of SAR ATR

    Figure 4.  Explainable machine learning

    Figure 5.  Decision saliency of gradient-based methods

    Figure 6.  Illustration of LIME [55]

    Figure 7.  Explainable SAR automatic target recognition

    Figure 8.  Decision importance analysis for the SAR target recognition model

    Figure 9.  Physical-model-guided feature learning for SAR images

    Table 1.  Typical explainability methods

    Explain the model
      Model-specific: Activation Maximization (AM) [43,44]; Concept Activation Vectors (TCAV) [45]
      Model-agnostic: Knowledge distilling [46]; Feature permutation [47]
    Explain a sample
      Model-specific: Gradient-based methods (Grad [48], GuidedBP [49], IntegratedGrad [50], SmoothGrad [51]); Feature perturbation analysis [52]; Layer-wise Relevance Propagation (LRP) [53]; Class Activation Mapping (CAM [54], Grad-CAM [6])
      Model-agnostic: Local surrogate models, e.g., LIME [55]; Example-based methods, e.g., influence functions [56] and critic samples [57]; Shapley-value-based methods [58]
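    As a concrete illustration of the model-agnostic column of Table 1, the sketch below implements permutation feature importance [47]. It is a minimal sketch under stated assumptions: `predict` stands for any fitted classifier's prediction function, `X` is a sample-by-feature matrix, and the repeat count is arbitrary; none of these names come from the paper.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    # Permutation importance [47]: the drop in accuracy after randomly
    # shuffling one feature column, averaged over several shuffles.
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)          # accuracy on intact features
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])                # destroy feature j's information
            drops.append(baseline - np.mean(predict(Xp) == y))
        scores[j] = np.mean(drops)               # larger drop => more important
    return scores
```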
  • [1] JIN Yaqiu. Multimode remote sensing intelligent information and target recognition: Physical intelligence of microwave vision[J]. Journal of Radars, 2019, 8(6): 710–716. doi: 10.12000/JR19083
    [2] KEYDEL E R, LEE S W, and MOORE J T. MSTAR extended operating conditions: A tutorial[C]. SPIE Volume 2757, Algorithms for Synthetic Aperture Radar Imagery III, Orlando, USA, 1996. doi: 10.1117/12.242059.
    [3] ZHAO Juanping, GUO Weiwei, ZHANG Zenghui, et al. A coupled convolutional neural network for small and densely clustered ship detection in SAR images[J]. Science China Information Sciences, 2019, 62(4): 42301. doi: 10.1007/s11432-017-9405-6
    [4] DU Lan, WANG Zhaocheng, WANG Yan, et al. Survey of research progress on target detection and discrimination of single-channel SAR images for complex scenes[J]. Journal of Radars, 2020, 9(1): 34–54. doi: 10.12000/JR19104
    [5] XU Feng, WANG Haipeng, and JIN Yaqiu. Deep learning as applied in SAR target recognition and terrain classification[J]. Journal of Radars, 2017, 6(2): 136–148. doi: 10.12000/JR16130
    [6] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization[J]. International Journal of Computer Vision, 2020, 128(2): 336–359. doi: 10.1007/s11263-019-01228-7
    [7] GOODFELLOW I J, SHLENS J, and SZEGEDY C. Explaining and harnessing adversarial examples[C]. 2015 International Conference on Learning Representations, San Diego, USA, 2015.
    [8] JI Shouling, LI Jinfeng, DU Tianyu, et al. Survey on techniques, applications and security of machine learning interpretability[J]. Journal of Computer Research and Development, 2019, 56(10): 2071–2096. doi: 10.7544/issn1000-1239.2019.20190540
    [9] WU Fei, LIAO Binbing, and HAN Yahong. Interpretability for deep learning[J]. Aero Weaponry, 2019, 26(1): 39–46. doi: 10.12132/ISSN.1673-5048.2018.0065
    [10] GUIDOTTI R, MONREALE A, RUGGIERI S, et al. A survey of methods for explaining black box models[J]. ACM Computing Surveys, 2018, 51(5): 93. doi: 10.1145/3236009
    [11] NOVAK L M, OWIRKA G J, and NETISHEN C M. Performance of a high-resolution polarimetric SAR automatic target recognition system[J]. The Lincoln Laboratory Journal, 1993, 6(1): 11–23.
    [12] GAO Gui. Statistical modeling of SAR images: A survey[J]. Sensors, 2010, 10(1): 775–795. doi: 10.3390/s100100775
    [13] GAO Gui. Review on the statistical modeling of SAR images[J]. Signal Processing, 2009, 25(8): 1270–1278. doi: 10.3969/j.issn.1003-0530.2009.08.019
    [14] GUO Weiwei. SAR image target segmentation and feature extraction[D]. [Master dissertation], National University of Defense Technology, 2007: 28–35.
    [15] HUAN Ruohong and YANG Ruliang. SAR target recognition based on MRF and Gabor wavelet feature extraction[C]. 2008 IEEE International Geoscience and Remote Sensing Symposium, Boston, USA, 2008: II-907–II-910. doi: 10.1109/igarss.2008.4779142.
    [16] PAPSON S and NARAYANAN R M. Classification via the shadow region in SAR imagery[J]. IEEE Transactions on Aerospace and Electronic Systems, 2012, 48(2): 969–980. doi: 10.1109/taes.2012.6178042
    [17] CASASENT D and CHANG W T. Correlation synthetic discriminant functions[J]. Applied Optics, 1986, 25(14): 2343–2350. doi: 10.1364/ao.25.002343
    [18] ZHAO Q and PRINCIPE J C. Support vector machines for SAR automatic target recognition[J]. IEEE Transactions on Aerospace and Electronic Systems, 2001, 37(2): 643–654. doi: 10.1109/7.937475
    [19] SUN Yijun, LIU Zhipeng, TODOROVIC S, et al. Adaptive boosting for SAR automatic target recognition[J]. IEEE Transactions on Aerospace and Electronic Systems, 2007, 43(1): 112–125. doi: 10.1109/taes.2007.357120
    [20] SUN Yongguang, DU Lan, WANG Yan, et al. SAR automatic target recognition based on dictionary learning and joint dynamic sparse representation[J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(12): 1777–1781. doi: 10.1109/lgrs.2016.2608578
    [21] POTTER L C and MOSES R L. Attributed scattering centers for SAR ATR[J]. IEEE Transactions on Image Processing, 1997, 6(1): 79–91. doi: 10.1109/83.552098
    [22] JI Kefeng, KUANG Gangyao, SU Yi, et al. Research on the extracting method of the scattering center feature from SAR imagery[J]. Journal of National University of Defense Technology, 2003, 25(1): 45–50. doi: 10.3969/j.issn.1001-2486.2003.01.010
    [23] DING Baiyuan, WEN Gongjian, YU Liansheng, et al. Matching of attributed scattering center and its application to synthetic aperture radar automatic target recognition[J]. Journal of Radars, 2017, 6(2): 157–166. doi: 10.12000/JR16104
    [24] JONES III G and BHANU B. Recognizing articulated objects in SAR images[J]. Pattern Recognition, 2001, 34(2): 469–485. doi: 10.1016/s0031-3203(99)00218-6
    [25] MAO Xiaojiao, SHEN Chunhua, and YANG Yubin. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections[C]. The 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 2016: 2810–2818.
    [26] DONG Chao, LOY C C, HE Kaiming, et al. Image super-resolution using deep convolutional networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(2): 295–307. doi: 10.1109/tpami.2015.2439281
    [27] LIU Li, OUYANG Wanli, WANG Xiaogang, et al. Deep learning for generic object detection: A survey[J]. International Journal of Computer Vision, 2020, 128(2): 261–318. doi: 10.1007/s11263-019-01247-4
    [28] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, 2016: 770–778.
    [29] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834–848. doi: 10.1109/tpami.2017.2699184
    [30] CHEN Sizhe, WANG Haipeng, XU Feng, et al. Target classification using the deep convolutional networks for SAR images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(8): 4806–4817. doi: 10.1109/tgrs.2016.2551720
    [31] PAN Zongxu, AN Quanzhi, and ZHANG Bingchen. Progress of deep learning-based target recognition in radar images[J]. Scientia Sinica Informationis, 2019, 49(12): 1626–1639. doi: 10.1360/SSI-2019-0093
    [32] HE Fengshou, HE You, LIU Zhunga, et al. Research and development on applications of convolutional neural networks of radar automatic target recognition[J]. Journal of Electronics and Information Technology, 2020, 42(1): 119–131. doi: 10.11999/JEIT180899
    [33] ZHAO Juanping, ZHANG Zenghui, YU Wenxian, et al. A cascade coupled convolutional neural network guided visual attention method for ship detection from SAR images[J]. IEEE Access, 2018, 6: 50693–50708. doi: 10.1109/access.2018.2869289
    [34] CHEN Huiyuan, LIU Zeyu, GUO Weiwei, et al. Fast detection of ship targets for large-scale remote sensing image based on a cascade convolutional neural network[J]. Journal of Radars, 2019, 8(3): 413–424. doi: 10.12000/JR19041
    [35] WAGNER S. Combination of convolutional feature extraction and support vector machines for radar ATR[C]. The 17th International Conference on Information Fusion (FUSION), Salamanca, Spain, 2014: 1–6.
    [36] WAGNER S A. SAR ATR by a combination of convolutional neural network and support vector machines[J]. IEEE Transactions on Aerospace and Electronic Systems, 2016, 52(6): 2861–2872. doi: 10.1109/taes.2016.160061
    [37] HUANG Zhongling, PAN Zongxu, and LEI Bin. What, where, and how to transfer in SAR target recognition based on deep CNNs[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(4): 2324–2336. doi: 10.1109/tgrs.2019.2947634
    [38] ZHAO Juanping, GUO Weiwei, LIU Bin, et al. Convolutional neural network-based SAR image classification with noisy labels[J]. Journal of Radars, 2017, 6(5): 514–523. doi: 10.12000/JR16140
    [39] GUNNING D. Explainable Artificial Intelligence (XAI)[R]. DARPA/I2O, 2017.
    [40] ADADI A and BERRADA M. Peeking inside the black-box: A survey on Explainable Artificial Intelligence (XAI)[J]. IEEE Access, 2018, 6: 52138–52160. doi: 10.1109/access.2018.2870052
    [41] LIPTON Z C. The mythos of model interpretability[J]. Communications of the ACM, 2018, 61(10): 36–43. doi: 10.1145/3233231
    [42] ZHANG Quanshi and ZHU Songchun. Visual interpretability for deep learning: A survey[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(1): 27–39. doi: 10.1631/fitee.1700808
    [43] MAHENDRAN A and VEDALDI A. Visualizing deep convolutional neural networks using natural pre-images[J]. International Journal of Computer Vision, 2016, 120(3): 233–255. doi: 10.1007/s11263-016-0911-8
    [44] NGUYEN A, CLUNE J, BENGIO Y, et al. Plug & play generative networks: Conditional iterative generation of images in latent space[J]. arXiv: 1612.00005, 2016.
    [45] KIM B, WATTENBERG M, GILMER J, et al. Interpretability beyond feature attribution: Quantitative Testing with Concept Activation Vectors (TCAV)[J]. arXiv: 1711.11279, 2017.
    [46] FROSST N and HINTON G. Distilling a neural network into a soft decision tree[J]. arXiv: 1711.09784, 2017.
    [47] ALTMANN A, TOLOŞI L, SANDER O, et al. Permutation importance: A corrected feature importance measure[J]. Bioinformatics, 2010, 26(10): 1340–1347. doi: 10.1093/bioinformatics/btq134
    [48] SIMONYAN K, VEDALDI A, and ZISSERMAN A. Deep inside convolutional networks: Visualising image classification models and saliency maps[J]. arXiv: 1312.6034, 2013.
    [49] SPRINGENBERG J T, DOSOVITSKIY A, BROX T, et al. Striving for simplicity: The all convolutional net[J]. arXiv: 1412.6806, 2014.
    [50] SUNDARARAJAN M, TALY A, and YAN Qiqi. Gradients of counterfactuals[J]. arXiv: 1611.02639, 2016.
    [51] SMILKOV D, THORAT N, KIM B, et al. SmoothGrad: Removing noise by adding noise[J]. arXiv: 1706.03825, 2017.
    [52] FONG R, PATRICK M, and VEDALDI A. Understanding deep networks via extremal perturbations and smooth masks[C]. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 2019: 2950–2958. doi: 10.1109/iccv.2019.00304.
    [53] BACH S, BINDER A, MONTAVON G, et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation[J]. PLoS One, 2015, 10(7): e0130140. doi: 10.1371/journal.pone.0130140
    [54] ZHOU Bolei, KHOSLA A, LAPEDRIZA A, et al. Learning deep features for discriminative localization[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, 2016: 2921–2929. doi: 10.1109/cvpr.2016.319.
    [55] RIBEIRO M, SINGH S, and GUESTRIN C. “Why should I trust you?”: Explaining the predictions of any classifier[C]. 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, San Diego, USA, 2016: 97–101. doi: 10.18653/v1/n16-3020.
    [56] KOH P W and LIANG P. Understanding black-box predictions via influence functions[C]. The 34th International Conference on Machine Learning, Sydney, Australia, 2017: 1885–1894.
    [57] KIM B, KHANNA R, and KOYEJO O. Examples are not enough, learn to criticize! Criticism for interpretability[C]. The 30th Annual Conference on Neural Information Processing Systems, Barcelona, Spain, 2016: 2280–2288.
    [58] LUNDBERG S M and LEE S I. A unified approach to interpreting model predictions[C]. The 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 4768–4777.
    [59] ZHANG Quanshi, YANG Yu, MA Haotian, et al. Interpreting CNNs via decision trees[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 6254–6263. doi: 10.1109/cvpr.2019.00642.
    [60] DU Mengnan, LIU Ninghao, SONG Qingquan, et al. Towards explanation of DNN-based prediction with guided feature inversion[C]. The 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 2018: 1358–1367.
    [61] ZEILER M D and FERGUS R. Visualizing and understanding convolutional networks[C]. The 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 818–833.
    [62] SAMEK W, BINDER A, MONTAVON G, et al. Evaluating the visualization of what a deep neural network has learned[J]. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(11): 2660–2673. doi: 10.1109/tnnls.2016.2599820
    [63] NAM W J, GUR S, CHOI J, et al. Relative attributing propagation: Interpreting the comparative contributions of individual units in deep neural networks[C]. The 34th AAAI Conference on Artificial Intelligence, New York, USA, 2020: 2501–2508.
    [64] RUDIN C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead[J]. Nature Machine Intelligence, 2019, 1(5): 206–215. doi: 10.1038/s42256-019-0048-x
    [65] XU K, BA J L, KIROS R, et al. Show, attend and tell: Neural image caption generation with visual attention[C]. The 32nd International Conference on Machine Learning (ICML), Lille, France, 2015: 2048–2057.
    [66] GREGOR K and LECUN Y. Learning fast approximations of sparse coding[C]. The 27th International Conference on Machine Learning, Haifa, Israel, 2010: 399–406.
    [67] ZHENG Shuai, JAYASUMANA S, ROMERA-PAREDES B, et al. Conditional random fields as recurrent neural networks[C]. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 2015: 1529–1537. doi: 10.1109/iccv.2015.179.
    [68] PENG Xi, TSANG I W, ZHOU J T, et al. K-meansNet: When k-means meets differentiable programming[J]. arXiv: 1808.07292, 2018.
    [69] ZHU Hongyuan, PENG Xi, CHANDRASEKHAR V, et al. DehazeGAN: When image dehazing meets differential programming[C]. The 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 2018: 1234–1240.
    [70] KARPATNE A, WATKINS W, READ J, et al. Physics-guided Neural Networks (PGNN): An application in lake temperature modeling[J]. arXiv: 1710.11431, 2017.
    [71] CHEN Tianshui, XU Muxin, HUI Xiaolu, et al. Learning semantic-specific graph representation for multi-label image recognition[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 522–531.
    [72] CHU Lingyang, HU Xia, HU Juhua, et al. Exact and consistent interpretation for piecewise linear neural networks: A closed form solution[C]. The 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 2018: 1244–1253.
    [73] BAU D, ZHOU Bolei, KHOSLA A, et al. Network dissection: Quantifying interpretability of deep visual representations[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 3319–3327. doi: 10.1109/cvpr.2017.354.
    [74] DATCU M, ANDREI V, DUMITRU C O, et al. Explainable deep learning for SAR data[C]. Φ-week, Frascati, Italy, 2019.
    [75] HUANG Zhongling, DATCU M, PAN Zongxu, et al. Deep SAR-Net: Learning objects from signals[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 161: 179–193. doi: 10.1016/j.isprsjprs.2020.01.016
    [76] ZHAO Juanping, DATCU M, ZHANG Zenghui, et al. Contrastive-regulated CNN in the complex domain: A method to learn physical scattering signatures from flexible PolSAR images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(12): 10116–10135. doi: 10.1109/tgrs.2019.2931620
    [77] CHEN Lifu, TAN Siyu, PAN Zhouhao, et al. A new framework for automatic airports extraction from SAR images using multi-level dual attention mechanism[J]. Remote Sensing, 2020, 12(3): 560. doi: 10.3390/rs12030560
    [78] LI Chen, DU Lan, DENG Sheng, et al. Point-wise discriminative auto-encoder with application on robust radar automatic target recognition[J]. Signal Processing, 2020, 169: 107385. doi: 10.1016/j.sigpro.2019.107385
    [79] CETIN M, KARL W C, and CASTANON D A. Feature enhancement and ATR performance using nonquadratic optimization-based SAR imaging[J]. IEEE Transactions on Aerospace and Electronic Systems, 2003, 39(4): 1375–1395. doi: 10.1109/taes.2003.1261134
    [80] KHANNA R, KIM B, GHOSH J, et al. Interpreting black box predictions using Fisher kernels[C]. The 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), Okinawa, Japan, 2019: 3382–3390.
Publication history
  • Received: 2020-05-11
  • Accepted: 2020-06-17
  • Available online: 2020-06-30
  • Issue published: 2020-06-28


    • Synthetic Aperture Radar (SAR) is an active microwave imaging radar capable of high-resolution, day-and-night, all-weather, wide-area observation, which gives it unique advantages in civilian and defense applications; under extreme meteorological conditions it may even be the only reliable source of observation data. SAR Automatic Target Recognition (ATR) is one of the key technologies for intelligent SAR image interpretation [1] and has attracted sustained attention and research since the birth of SAR in the 1950s [2]. In particular, with the rapid development of deep learning in recent years, deep neural networks have also been applied to SAR target detection and recognition and have substantially outperformed traditional techniques [3-5]. These performance gains, however, rest mainly on the capacity to fit large numbers of parameters to massive annotated data; the internal process is a black box, so it is difficult to understand the working mechanism and decision logic behind the model, or to delimit the boundary of its decision behavior. As shown in Fig. 1, a simple Convolutional Neural Network (CNN) with five convolutional blocks (Conv2d-ReLU-MaxPool2d) trained by the authors reaches 93.80% recognition accuracy on the MSTAR [2] test set (Fig. 1(d)) and correctly classifies the input sample of Fig. 1(a) (Fig. 1(c)); yet the decision saliency region extracted with Grad-CAM [6] (Gradient-weighted Class Activation Mapping) (Fig. 1(b)) shows that the decision does not depend solely on the target region: some background regions also contribute substantially to the final decision, and the rationality of such decisions still has to be analyzed and assessed in light of SAR mechanisms and characteristics.

      Figure 1.  Grad-CAM [6] of a simple CNN classifier on a T72 SAR image
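
      To make the Grad-CAM analysis above concrete, the following minimal sketch (in PyTorch; the channel widths, the 128x128 input size, and the class count are illustrative assumptions, not the authors' exact configuration) builds a 5-block Conv2d-ReLU-MaxPool2d classifier of the kind described and computes a Grad-CAM [6] saliency map for one SAR chip:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    # A 5-block Conv2d-ReLU-MaxPool2d classifier like the one described in
    # the text; widths, 10 classes, and 128x128 inputs are assumptions.
    def __init__(self, num_classes=10):
        super().__init__()
        chans = [1, 16, 32, 64, 128, 256]
        blocks = []
        for cin, cout in zip(chans[:-1], chans[1:]):
            blocks += [nn.Conv2d(cin, cout, 3, padding=1),
                       nn.ReLU(),
                       nn.MaxPool2d(2)]
        self.features = nn.Sequential(*blocks)
        self.fc = nn.Linear(256 * 4 * 4, num_classes)   # 128 / 2**5 = 4

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

def grad_cam(model, x, cls):
    # Grad-CAM [6]: spatially average the class-score gradient to get one
    # weight per channel of the last conv feature maps, take the weighted
    # channel sum, apply ReLU, and upsample to the input size.
    acts = []
    def keep(module, inp, out):
        out.retain_grad()            # populate .grad on this non-leaf tensor
        acts.append(out)
    handle = model.features.register_forward_hook(keep)
    model(x)[0, cls].backward()
    handle.remove()
    fmap = acts[0]                                      # (1, C, h, w)
    weights = fmap.grad.mean(dim=(2, 3), keepdim=True)  # (1, C, 1, 1)
    cam = F.relu((weights * fmap).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear",
                        align_corners=False)[0, 0]
    return cam / cam.max().clamp(min=1e-8)              # normalized heat map

# Usage sketch: x is a 1x1x128x128 SAR chip tensor.
# model = SimpleCNN().eval()
# heatmap = grad_cam(model, x, cls=model(x).argmax(1).item())
```

      Large values in the returned map mark the regions that most influenced the class score; as the text notes, for SAR chips such regions should then be checked against scattering mechanisms rather than taken at face value.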

      On the one hand, SAR target recognition with opaque, unexplainable decisions harbors decision risk in high-stakes applications such as military reconnaissance and precision strike, and has difficulty earning users' trust. On the other hand, SAR images reflect the electromagnetic scattering characteristics of targets and resist direct visual understanding; the feature representations that deep neural networks mine automatically from large volumes of data may therefore contain new knowledge, and understanding these features may in turn inspire people to exploit that knowledge and improve SAR target interpretation. Furthermore, deep neural networks have intricate working mechanisms and a degree of fragility [7]; the decision process and evidence behind deep models must be understood so that their defects can be uncovered and the models and algorithms improved, strengthening the robustness of SAR target recognition systems. Finally, SAR images differ from optical images in essence and are highly sensitive to imaging parameters, which makes complete training sets very hard to obtain; deep models for SAR target recognition therefore need to account for the characteristics of SAR data and incorporate SAR-specific physical, statistical, and semantic domain knowledge, so as to build explainable SAR target recognition models with stronger explainability, robustness, and generalization from small samples.
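
      As one concrete sense of the "fragility" cited above, the following minimal sketch of the Fast Gradient Sign Method from Ref. [7] perturbs an input along the sign of the loss gradient (the eps value and tensor shapes are illustrative assumptions; `model` can be any differentiable classifier, such as the CNN sketched earlier):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, label, eps=0.02):
    # One-step FGSM [7]: move the input by eps in the direction that
    # maximally increases the classification loss.
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), label).backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

# If model(x).argmax(1) == label before the attack, the prediction on
# fgsm(model, x, label) may flip even though the perturbation is barely
# visible -- one measurable form of the fragility discussed above.
```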

      Explainability is the interface between humans and a decision model: it aims to give a clear, human-understandable summary and indication of the model's decisions, helping people understand what the model has learned from the data, how it decides on each sample, and whether its decisions are reasonable and reliable [8-10]. The electromagnetic imaging mechanism of SAR differs fundamentally from the human visual system and from optical remote sensing, which makes cognitive understanding and interpretation of SAR imagery very difficult. As illustrated in Fig. 2, a SAR system receives the scattering energy of each independent unit composing a ground target, so the target appearing in a SAR image is an aggregate of scattering units, mostly presented as combinations of discrete points and lines. The distinctive imaging process of SAR introduces speckle, structural loss, geometric distortions (foreshortening, layover), and shadow, so SAR images differ markedly from optical images in visual characteristics, exhibiting a "what you see is not what you know" quality; moreover, SAR images are sensitive to observation parameters and samples are hard to acquire, so SAR target recognition remains a worldwide challenge. Building on a summary of current SAR target recognition techniques and their open problems, and drawing on recent progress in machine learning and deep learning explainability, this paper discusses the explainability of SAR target recognition from the perspectives of model understanding, model diagnosis, and model improvement, aiming to break through the current technical bottlenecks and application limits of SAR ATR. Finally, the paper discusses possible future research on SAR target recognition in terms of introducing and combining domain knowledge, human-machine collaboration, and interactive learning, in the hope of advancing SAR target recognition technology.

      Figure 2.  Optical image and SAR images of the T62 tank at different azimuth angles in the MSTAR dataset

    • SAR image target interpretation generally follows a "detection -> discrimination -> recognition" pipeline [11]. The main purpose of SAR target detection and discrimination is to locate the position and region of targets in the image, laying the groundwork for subsequent recognition; Du et al. survey the current progress of SAR target detection and discrimination in Ref. [4].