AIR-SARShip-1.0: High-resolution SAR Ship Detection Dataset

SUN Xian, WANG Zhirui, SUN Yuanrui, DIAO Wenhui, ZHANG Yue, FU Kun

SUN Xian, WANG Zhirui, SUN Yuanrui, et al. AIR-SARShip-1.0: High-resolution SAR ship detection dataset[J]. Journal of Radars, 2019, 8(6): 852–862. DOI:  10.12000/JR19097
  • CLC number: TN957.51; TN958


Funds: The National Natural Science Foundation of China (61725105, 41801349, 41701508), National Major Project on High Resolution Earth Observation System (GFZX0404120405)
    Author Bios:

    SUN Xian was born in 1981. He is a researcher and doctoral supervisor at the Aerospace Information Research Institute, Chinese Academy of Sciences. His main research fields are computer vision and remote sensing image interpretation. E-mail: sunxian@mail.ie.ac.cn

    WANG Zhirui was born in 1990. He received his PhD from Tsinghua University in 2018. He is currently a research assistant at the Aerospace Information Research Institute, Chinese Academy of Sciences. His main research field is intelligent interpretation of SAR images. E-mail: zhirui1990@126.com

    SUN Yuanrui was born in 1995. He received his bachelor’s degree in engineering from China University of Geosciences (Wuhan) in 2017. He is now a doctoral candidate in information and communication engineering at the University of Chinese Academy of Sciences. His main research field is SAR ship detection. E-mail: sunyuanrui17@mails.ucas.ac.cn

    DIAO Wenhui was born in 1988. He received his PhD from the University of Chinese Academy of Sciences in 2016. He is currently a research assistant at the Aerospace Information Research Institute, Chinese Academy of Sciences. His main research interests include deep learning theory and its application in remote sensing image interpretation. E-mail: whdiao@mail.ie.ac.cn

    ZHANG Yue was born in 1990. He received his PhD from the University of Chinese Academy of Sciences in 2017. He is currently a research assistant at the Aerospace Information Research Institute, Chinese Academy of Sciences. His main research field is intelligent analysis and interpretation of SAR images. E-mail: zhangyue@air.cas.ac.cn

    FU Kun was born in 1974. He is a researcher and doctoral supervisor. He is the president assistant at the Aerospace Information Research Institute, Chinese Academy of Sciences, and the director of a Key Laboratory of the Chinese Academy of Sciences. He is mainly engaged in research on geospatial data analysis and mining and on intelligent interpretation of remote sensing images. He has won the National Science and Technology Progress Award, including a first prize, as well as first prizes at the provincial and ministerial level. E-mail: fukun@mail.ie.ac.cn

    Corresponding author: SUN Xian E-mail: sunxian@mail.ie.ac.cn
  • Figure  1.  Annotated example in the dataset

    Figure  2.  Examples of the AIR-SARShip-1.0 dataset

    Figure  3.  Area distribution of ship rectangle in the dataset

    Figure  4.  Imaging examples of the same area at different angles

    Figure  5.  Examples of original image and rotated images

    Figure  6.  Main structure of the DCENN network

    Figure  7.  Fusion feature map based on dense connection

    Figure  8.  Detection example of SAR ship based on Faster-RCNN

    Table 1.  The dataset information

    Resolution    Imaging mode        Polarization mode    Format
    1 m, 3 m      Spotlight, Strip    Single               Tiff

    Table 2.  The performance benchmarks of classic ship detection algorithms

    Algorithm                              AP (%)
    CFAR                                   27.1
    CFAR method based on K distribution    19.2
    KSW                                    28.2

    Table 3.  The performance benchmarks of SAR ship detection algorithms based on deep learning

    Performance ranking    Algorithm         AP (%)    FPS
    1                      DCENN             88.1      24
    2                      Faster-RCNN-DR    84.2      29
    3                      Faster-RCNN       79.3      30
    4                      SSD-512           74.3      64
    5                      SSD-300           72.4      151
    6                      YOLOv1            64.7      160

    Table 4.  The performance benchmarks of different scenes based on different algorithms

    Performance ranking    Algorithm         Nearshore ship AP (%)    Offshore ship AP (%)
    1                      DCENN             68.1                     96.3
    2                      Faster-RCNN-DR    57.6                     94.6
    3                      SSD-512           40.3                     89.4
  • [1] ZHANG Jie, ZHANG Xi, FAN Chenqing, et al. Discussion on Application of Polarimetric Synthetic Aperture Radar in Marine Surveillance[J]. Journal of Radars, 2016, 5(6): 596–606. doi:  10.12000/JR16124
    [2] REY M T, CAMPBELL J, and PETROVIC D. A comparison of ocean clutter distribution estimators for CFAR-based ship detection in RADARSAT imagery[R]. Defence Research Establishment, Report No.1340, 1998.
    [3] NOVAK L M, BURL M C, and IRVING W W. Optimal polarimetric processing for enhanced target detection[J]. IEEE Transactions on Aerospace and Electronic Systems, 1993, 29(1): 234–244. doi:  10.1109/7.249129
    [4] STAGLIANO D, LUPIDI A, and BERIZZI F. Ship detection from SAR images based on CFAR and wavelet transform[C]. 2012 Tyrrhenian Workshop on Advances in Radar and Remote Sensing, Naples, Italy, 2012: 53–58.
    [5] HE Jinglu, WANG Yinghua, LIU Hongwei, et al. A novel automatic PolSAR ship detection method based on superpixel-level local information measurement[J]. IEEE Geoscience and Remote Sensing Letters, 2018, 15(3): 384–388. doi:  10.1109/LGRS.2017.2789204
    [6] DENG Jia, DONG Wei, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]. 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009: 248–255.
    [7] EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL Visual Object Classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303–338. doi: 10.1007/s11263-009-0275-4
    [8] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]. The 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 740–755.
    [9] XIA Guisong, BAI Xiang, DING Jian, et al. DOTA: A large-scale dataset for object detection in aerial images[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 3974–3983.
    [10] ZHANG Yuanlin, YUAN Yuan, FENG Yachuang, et al. Hierarchical and robust convolutional neural network for very high-resolution remote sensing object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(8): 5535–5548. doi:  10.1109/TGRS.2019.2900302
    [11] LONG Yang, GONG Yiping, XIAO Zhifeng, et al. Accurate object localization in remote sensing images based on convolutional neural networks[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(5): 2486–2498. doi:  10.1109/TGRS.2016.2645610
    [12] XIAO Zhifeng, LIU Qing, TANG Gefu, et al. Elliptic fourier transformation-based histograms of oriented gradients for rotationally invariant object detection in remote-sensing images[J]. International Journal of Remote Sensing, 2015, 36(2): 618–644. doi:  10.1080/01431161.2014.999881
    [13] LI Jianwei, QU Changwen, and SHAO Jiaqi. Ship detection in SAR images based on an improved faster R-CNN[C]. 2017 SAR in Big Data Era: Models, Methods and Applications (BIGSARDATA), Beijing, China, 2017: 1–6.
    [14] HUANG Lanqing, LIU Bin, LI Boying, et al. OpenSARShip: A dataset dedicated to Sentinel-1 ship interpretation[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2017, 11(1): 195–208. doi:  10.1109/JSTARS.2017.2755672
    [15] WANG Yuanyuan, WANG Chao, ZHANG Hong, et al. A SAR dataset of ship detection for deep learning under complex backgrounds[J]. Remote Sensing, 2019, 11(7): 765. doi:  10.3390/rs11070765
    [16] ZHANG Qingjun. System Design and Key Technologies of the GF-3 Satellite[J]. Acta Geodaetica et Cartographica Sinica, 2017, 46(3): 269–277. doi:  10.11947/j.AGCS.2017.20170049
    [17] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 21–37.
    [18] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 779–788.
    [19] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2999–3007.
    [20] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]. 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 580–587.
    [21] GIRSHICK R. Fast R-CNN[C]. 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 1440–1448.
    [22] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[C]. The 28th International Conference on Neural Information Processing Systems, Montreal, Canada, 2015: 91–99.
    [23] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 936–944.
    [24] LIU Peng and JIN Yaqiu. A study of ship rotation effects on SAR image[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(6): 3132–3144. doi:  10.1109/TGRS.2017.2662038
    [25] JIAO Jiao, ZHANG Yue, SUN Hao, et al. A densely connected end-to-end neural network for multiscale and multiscene SAR ship detection[J]. IEEE Access, 2018, 6: 20881–20892. doi:  10.1109/ACCESS.2018.2825376
Publication history
  • Received: 2019-11-16
  • Revised: 2019-12-17
  • Published online: 2019-12-27

English Abstract
    • Synthetic Aperture Radar (SAR) is an active microwave imaging radar that provides all-weather, day-and-night imaging capability, with broad application prospects in both military and civilian fields. With the development of China’s Earth observation technology in recent years, many high-resolution SAR satellites, such as Gaofen-3, have been put into use, and both the quality and quantity of SAR data continue to increase.

      The interpretation of SAR images faces many challenges. Unlike optical imaging, SAR imaging is not visually intuitive, and speckle and layover effects introduced during imaging usually interfere with target interpretation. Most daily operations still rely on manual interpretation, which is time-consuming and labor-intensive, making it difficult to meet the need for real-time interpretation of massive SAR images.

      Continuous monitoring of ships in ports and maritime areas is an important application task[1], and ship detection has long been a research focus in the field of SAR image interpretation. Ship detection is divided into two types: nearshore and offshore. In general, the background of offshore ships is relatively uniform, which makes extracting foreground objects somewhat easier. By contrast, the nearshore area has a larger number and a wider variety of ships; in addition, ports lie at the land–sea boundary and are affected by background clutter and ground interference. Therefore, detecting nearshore ships is more difficult.

      The classic ship detection methods mainly combine statistical learning with the Constant False Alarm Rate (CFAR) principle. In research on ship detection for single-polarization SAR, Rey et al.[2] proposed a detection method that combines a K-distribution ocean clutter model with CFAR. Novak et al.[3] developed a two-parameter CFAR by using a Gaussian model. Stagliano et al.[4] proposed a SAR ship detection algorithm based on the combination of CFAR and the wavelet transform. He et al.[5] proposed an automatic ship detection method for polarimetric SAR based on superpixel-level local information measurement; the algorithm generates multiscale superpixels, calculates the measured values between each superpixel and its surrounding pixels, and converts different metrics from the superpixel level to the pixel level for discrimination and detection. These traditional methods have been widely used in the ship detection field and rely on hand-crafted features and classifiers to extract ship characteristics. For example, the performance of the CFAR algorithm depends on the modeling of ocean clutter. Hand-crafted feature classifiers often work well for offshore ships against relatively uniform backgrounds; in nearshore scenarios, however, performance deteriorates because traditional methods cannot fully distinguish real ships from false-alarm targets such as islands, reefs, and man-made facilities near the shore.

      In recent years, with the progressive development of deep learning, many target detection algorithms based on deep neural network models have been proposed, addressing the deficiencies of traditional learning methods to a certain extent. Commonly used network models include autoencoders, Boltzmann machines, and Convolutional Neural Networks (CNNs). For CNNs in particular, many basic networks have emerged, such as AlexNet, VGG, GoogLeNet, and Residual Network (ResNet). Many target detection models have been built on these backbones, including the classic detectors SSD, YOLOv1, and the Faster Region-based CNN (Faster-RCNN). These methods have gradually become the mainstream in the field of SAR ship detection.

      However, deep learning methods often require large amounts of data for training. In the field of computer vision, many public sample datasets are available, such as ImageNet[6], VOC[7], and COCO[8]. The data scale reaches thousands of types of targets and millions of slices. In the past two years, some datasets such as DOTA[9], HRRSD[10], and RSOD[11,12] have been released successively in the field of optical remote sensing, thereby facilitating the research and test of many algorithms.

      By contrast, the existing datasets in the field of SAR ship detection are relatively limited. Publicly available datasets include SSDD[13], OpenSARShip[14], and the dataset provided in Ref. [15]. These three datasets mainly consist of civilian ship slices; the slice size is generally 256 × 256 pixels, and the resolutions include 3 m, 5 m, 8 m, 10 m, and 20 m. Most backgrounds feature offshore scenarios, and nearshore scenarios are limited. The release of these three datasets has greatly promoted the application of deep neural network models in SAR ship detection, and their benchmarks are defined based on mainstream deep learning algorithms.

      In practical applications, the ship detection task is often performed on whole-scene images whose coverage generally spans tens of square kilometers or more. Under this condition, the environment around the target, such as docks, roads, outbuildings, and even waves, has a great impact on ship detection performance, especially in nearshore and island-reef scenarios. Therefore, a dataset that contains more realistic and diverse scenarios, covering both offshore and nearshore areas as well as multiple types of ship targets, will contribute to training models with better performance, stronger robustness, and higher practicality.

      To promote research on SAR ship detection and improve the utilization of domestically produced data, this paper releases AIR-SARShip-1.0, a SAR ship dataset based on Gaofen-3 satellite data. The dataset contains 31 SAR images; the scene types include ports, islands, reefs, and the open sea. The annotation information is the ship position, which has been confirmed by professional interpreters. Currently, the dataset is mainly intended for ship target detection in complex scenes. The dataset is free to download via the link on the official website of the Journal of Radars. This paper also uses several common deep learning networks for comparative experiments and analysis; the resulting performance indexes form a benchmark for SAR ship detection that other scholars can cite as a reference in related research.

    • Gaofen-3 is a civilian microwave remote sensing imaging satellite developed under the major project of the National High-Resolution Earth Observation System, and it is also China’s first C-band multipolarization high-resolution synthetic aperture radar satellite[16]. The AIR-SARShip-1.0 dataset is collected from the Gaofen-3 satellite and contains 31 large-scene images. The detailed information of this dataset is shown in Tab. 1. The image resolutions are 1 m and 3 m, and the imaging modes include both spotlight and strip. All images are single-polarization, about 3000 × 3000 pixels in size, and saved in TIFF format. Details of each image, including image number, pixel size, resolution, sea state, scene type, and the number of ships, are presented in App. Tab. 1 of this paper.

      Table 1.  The dataset information

      Resolution    Imaging mode        Polarization mode    Format
      1 m, 3 m      Spotlight, Strip    Single               Tiff
    • The AIR-SARShip-1.0 dataset is labeled according to the annotation format of the PASCAL VOC dataset, and the results are saved as XML files. Fig. 1(a) shows an example with several annotated rectangular boxes, and Fig. 1(b) shows part of the corresponding XML label file. The file in Fig. 1(b) actually contains the rectangle information of all ships in Fig. 1(a), but only one target box is listed here as an example. The XML file records the image file name, pixel size, number of channels, resolution, and the category and position of each target box. The top-left corner of the SAR image is set as the origin of coordinates, and each target is labeled by a rectangular box defined by its top-left corner (xmin, ymin) and bottom-right corner (xmax, ymax), where the coordinates of these two points are actual pixel positions in the image. Fig. 2 presents some typical scenes in this dataset; the SAR images also contain surrounding harbor, sea, and inland areas, which is close to the real ship detection task.

      Figure 1.  Annotated example in the dataset

      Figure 2.  Examples of the AIR-SARShip-1.0 dataset
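Since the annotations follow the PASCAL VOC convention, they can be read with a standard XML parser. The sketch below collects all ship boxes from one label file; the tag names follow the VOC format described above, and the exact field layout of the released files may differ slightly:

```python
import xml.etree.ElementTree as ET

def load_ship_boxes(xml_path):
    """Parse a PASCAL VOC style annotation file and return the
    (xmin, ymin, xmax, ymax) pixel box of every labeled object."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.iter("object"):
        bnd = obj.find("bndbox")
        boxes.append((
            int(bnd.find("xmin").text), int(bnd.find("ymin").text),
            int(bnd.find("xmax").text), int(bnd.find("ymax").text),
        ))
    return boxes
```

Each returned tuple is a box in the image coordinate frame with the origin at the top-left corner, matching the annotation convention above.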

      The split between training and test sets is important for training. Considering that this dataset contains 31 large-scene images, we take 21 images as training data and the remaining 10 images as test data. The area distribution of the bounding boxes is shown in Fig. 3, where the horizontal axis represents the area of the bounding box in pixels and the vertical axis is the proportion of all ships falling in the corresponding area bin. For example, the first column indicates that 6% of ships have an area of less than 1000 pixels, and the second column indicates that 13% of ships have an area between 1000 and 2000 pixels. Fig. 3 shows that most target areas are between 2000 and 5000 pixels, so the targets occupy only a small fraction of the whole image. Even if the large image is cropped into 500 × 500 pixel slices, the average area ratio of a ship target in a slice is between 0.008 and 0.020. Compared with the COCO dataset, one of the most challenging datasets in computer vision, in which about 41% of targets are small, AIR-SARShip-1.0 contains even more small targets in large scenes. Therefore, AIR-SARShip-1.0 mainly focuses on detection performance for small-scale targets.

      Figure 3.  Area distribution of ship rectangle in the dataset
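The area ratios quoted above follow from a one-line computation (a quick sanity check, not part of the dataset tooling):

```python
def area_ratio(box_area_px, slice_size=500):
    """Fraction of a slice_size x slice_size slice covered by a box
    of the given pixel area."""
    return box_area_px / (slice_size * slice_size)

# Typical ship boxes of 2000-5000 pixels cover 0.008-0.020 of a 500 x 500 slice.
print(area_ratio(2000), area_ratio(5000))  # 0.008 0.02
```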

    • Before deep learning became popular, researchers worldwide conducted in-depth research on SAR ship detection and proposed many classical algorithms, such as the two-parameter CFAR algorithm, the optimal entropy automatic threshold method (KSW), and the CFAR method based on the K distribution. The optimal entropy automatic threshold method applies Shannon entropy from information theory to image segmentation; by selecting double thresholds, it alleviates the problems of disconnected ship detections and false alarms in high-resolution images. CFAR detection is one of the most commonly used and effective algorithms in radar signal detection. Its core idea is to calculate the threshold for detecting ship targets on the basis of a constant false alarm rate and the statistical characteristics of ocean clutter in SAR images, i.e., the probability density function of the ocean clutter. When the ocean background clutter is modeled with a Gaussian distribution, the two-parameter CFAR algorithm is obtained. However, in many cases the Gaussian model does not describe ocean clutter well; hence, in 1976, Jakeman and Pusey introduced the K distribution for ocean clutter, and the widely adopted CFAR method based on the K distribution further improves the accuracy of ship detection. In the experimental part of this paper, these three classical ship detection algorithms are used to test and analyze the AIR-SARShip-1.0 dataset.
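To make the two-parameter CFAR idea concrete, the following minimal sketch thresholds each pixel against the mean and standard deviation of a local clutter estimate. The window sizes and the threshold factor k are illustrative choices, not the parameters used in the paper:

```python
import numpy as np

def two_param_cfar(img, guard=4, bg=8, k=3.0):
    """Two-parameter CFAR sketch: a pixel is a detection when it exceeds
    the local background mean by k local standard deviations.
    Guard cells around the test pixel are excluded from the clutter estimate."""
    h, w = img.shape
    out = np.zeros((h, w), dtype=bool)
    r = guard + bg                      # half-width of the full sliding window
    for i in range(r, h - r):
        for j in range(r, w - r):
            win = img[i - r:i + r + 1, j - r:j + r + 1].astype(float)
            # Mask out the guard area (and the test pixel itself) with NaN
            win[bg:bg + 2 * guard + 1, bg:bg + 2 * guard + 1] = np.nan
            mu, sigma = np.nanmean(win), np.nanstd(win)
            out[i, j] = img[i, j] > mu + k * sigma
    return out
```

The guard cells prevent the target's own energy from contaminating the clutter statistics; in the Gaussian-model variant discussed above, mu and sigma are exactly the two estimated parameters.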

    • In recent years, with the development of deep learning, many object detection algorithms have been proposed in the computer vision field, which fall mainly into two categories: single-stage detectors and two-stage detectors. SSD[17], YOLOv1[18], and RetinaNet[19] are representative single-stage detection algorithms. YOLOv1 contains only two parts: a feature extraction part and a box prediction part. It divides the image into an S×S grid; the grid cell containing an object’s center is responsible for predicting the position and category of that object’s box, and each cell can predict objects of only a single class. SSD differs from YOLOv1 in that it adds anchor boxes and multiscale feature extraction layers, thereby improving on YOLOv1’s coarse grid and poor detection accuracy for small targets. Representative two-stage algorithms include R-CNN[20], Fast-RCNN[21], Faster-RCNN[22], and feature pyramid networks[23]. The most representative, Faster-RCNN, consists of three parts: the first is the basic network, which extracts high-level features from the image; the second is the Region Proposal Network (RPN), which proposes candidate boxes that may contain targets; and the third is the prediction-box regression network, which classifies each candidate box and refines its location by regression. Because it has a candidate-box extraction stage, a two-stage detection network is better than a single-stage one at controlling the proportion of positive and negative samples and at refining box positions, but this also greatly increases detection time.
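The YOLOv1 grid assignment rule described above can be illustrated in a few lines; the 448-pixel input size and S = 7 follow the original YOLOv1 configuration, and this is a sketch of the assignment rule only, not of the full detector:

```python
def responsible_cell(cx, cy, img_size=448, s=7):
    """Return the (col, row) of the grid cell responsible for an object
    whose center lies at pixel coordinates (cx, cy) in a YOLOv1-style
    s x s grid over an img_size x img_size input."""
    return int(cx * s / img_size), int(cy * s / img_size)

# An object centered at (224, 100) is assigned to grid cell (3, 1).
```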

      Object detection algorithms in the vision domain share similar basic networks, such as VGG and ResNet. VGG consists mainly of two parts: a convolutional network and a fully connected network. ResNet addresses the problem of network performance degrading as depth increases: it introduces a skip connection to form a residual block, which greatly increases the usable network depth. Commonly used ResNet variants include ResNet50, ResNet101, and ResNet152.

    • At present, the main data augmentation methods are flipping, random image scaling, and 90-degree rotation. However, SAR satellites often image the same location at multiple times and from multiple angles, and the angle is uncertain, being neither a 90-degree rotation nor a 180-degree flip. As shown in Fig. 4, two SAR images of the same place look different because of the imaging angle: SAR imaging differs from optical imaging, and imaging results from different angles differ[24]. Using only 90-degree rotations for augmentation therefore limits the achievable detection performance. To solve this problem, this paper adopts Faster-RCNN with Dense Rotation (Faster-RCNN-DR) augmentation at a small angle interval, increasing the diversity of data angles to further improve SAR ship detection performance. Fig. 5 shows the original image and the images after 20°, 40°, and 60° counterclockwise rotation.

      Figure 4.  Imaging examples of the same area at different angles

      Figure 5.  Examples of original image and rotated images
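When an image is rotated for dense-rotation augmentation, the ship labels must follow it. One geometric way to do this is to rotate each box's corners about the image center and re-fit an axis-aligned box; this is an illustrative sketch, not the authors' exact implementation:

```python
import math

def rotate_box(box, angle_deg, cx, cy):
    """Rotate an axis-aligned (xmin, ymin, xmax, ymax) box by angle_deg
    about (cx, cy) and return the axis-aligned bounding box of the
    rotated corners."""
    xmin, ymin, xmax, ymax = box
    t = math.radians(angle_deg)
    corners = [(xmin, ymin), (xmin, ymax), (xmax, ymin), (xmax, ymax)]
    rotated = [((x - cx) * math.cos(t) - (y - cy) * math.sin(t) + cx,
                (x - cx) * math.sin(t) + (y - cy) * math.cos(t) + cy)
               for x, y in corners]
    xs, ys = zip(*rotated)
    return (min(xs), min(ys), max(xs), max(ys))

# Dense rotation at 10-degree intervals, as used by Faster-RCNN-DR:
angles = list(range(0, 360, 10))
```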

    • SAR images have diverse resolutions depending on application and imaging mode. The same ship appears at different sizes in images of different resolutions, and different ships appear at different sizes within an image of the same resolution. These multiscale characteristics of ships in multiresolution SAR images pose great challenges to object detection. In deep CNNs, the feature maps of low-level convolutional layers contain rich spatial information but little semantic information, while the feature maps of higher layers contain more semantic information but less spatial information; small objects retain little information after multilayer convolution, which hurts small-object detection and recognition. Therefore, to solve the multiscale ship detection problem in SAR images of different resolutions, Ref. [25] proposed a Densely Connected End-to-end Neural Network (DCENN) for ship detection. The main structure of this network, which uses ResNet101 as the backbone, is shown in Fig. 6. As the convolutional network deepens and the image is convolved repeatedly, the feature maps contain increasing semantic information but decreasing resolution. To combine high-resolution feature maps with the semantic information of high-level feature maps, the high-level and low-level feature maps are iteratively connected as shown in Fig. 7. On top of the basic network and the RPN, a two-stage detection subnetwork is formed (shown in the dashed box in Fig. 6), divided into a proposal-box pooling part and a fully connected part for classification and regression. Making these two parts lightweight not only preserves detection accuracy but also reduces memory consumption and improves processing speed.

      Figure 6.  Main structure of the DCENN network

      Figure 7.  Fusion feature map based on dense connection
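The high-low feature fusion described above can be sketched as upsampling a coarse, semantically rich map and concatenating it with a finer map along the channel axis. The channel counts and the nearest-neighbor upsampling here are illustrative assumptions, not DCENN's actual configuration:

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a (channels, height, width) feature map."""
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def dense_fuse(low, high):
    """Concatenate a low-level (high-resolution) map with an upsampled
    high-level (semantic) map along the channel axis."""
    return np.concatenate([low, upsample2x(high)], axis=0)

low = np.random.rand(64, 100, 100)    # fine spatial detail, weak semantics
high = np.random.rand(128, 50, 50)    # coarse but semantically rich
fused = dense_fuse(low, high)         # shape (192, 100, 100)
```

Iterating this fusion from the deepest layer downward yields the densely connected feature maps of Fig. 7.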

    • We conducted experiments on the AIR-SARShip-1.0 dataset to verify the superiority of the deep learning methods. The experiments used an Intel Xeon E5-2630 CPU running Ubuntu 16.04, 32 GB of memory, and an NVIDIA Tesla P100 GPU for the deep learning algorithms; the traditional algorithms run on the CPU without GPU acceleration. The dataset is divided into a test set of 10 images and a training set of 21 images, and provides train.txt and test.txt files recording the filenames of the training and test images, respectively. In the CFAR algorithm, the ocean clutter is assumed to follow the standard Gaussian distribution N(0, 1); in the CFAR algorithm based on the K distribution, the parameter K is 2; in the KSW algorithm, the optimal threshold is selected automatically from the image. The traditional algorithms require no training data and are therefore evaluated directly on the test set. The test accuracy is shown in Tab. 2.

      Table 2.  The performance benchmarks of classic ship detection algorithms

      Algorithm AP(%)
      CFAR 27.1
      CFAR method based on K distribution 19.2
      KSW 28.2
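      To make the Gaussian-clutter assumption above concrete, the following is a minimal sketch of a global k-sigma CFAR detector. It is an illustrative simplification, not the benchmarked implementation: a practical CFAR estimates clutter statistics in a sliding window around each pixel, and the toy scene and the threshold k are assumptions.

```python
import numpy as np

def cfar_gaussian(image, k=3.0):
    """k-sigma CFAR sketch under a Gaussian clutter model: estimate the
    clutter mean and standard deviation from the whole image and flag
    pixels above mu + k * sigma as target candidates."""
    mu, sigma = image.mean(), image.std()
    return image > mu + k * sigma

# Toy scene: N(0,1) sea clutter with one bright 3x3 "ship"
rng = np.random.default_rng(0)
scene = rng.normal(0.0, 1.0, (100, 100))
scene[40:43, 60:63] += 10.0
mask = cfar_gaussian(scene, k=5.0)
print(int(mask.sum()))  # a handful of ship pixels, essentially no false alarms
```

      Raising k lowers the false-alarm rate at the cost of missing dimmer targets, which is the constant-false-alarm-rate trade-off the algorithm's name refers to.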

      The calculation of AP is shown in Eq. (1), and the interpolated precision $ {p_{{\rm{interp}}}}({r_{n + 1}}) $ in Eq. (1) is defined in Eq. (2), where p( $ \tilde {r} $ ) denotes the maximum precision at recall $ \tilde {r} $ . The precision p and the recall r are computed as shown in Eq. (3) and Eq. (4). TP is the number of detection bounding boxes where the detection result is true and the ground truth is true. FP is the number of detection bounding boxes where the detection result is true but the ground truth is false. FN is the number of cases where the ground truth is true but no detection matches it, i.e., missed ships. As shown in Eq. (5), the Intersection Over Union (IOU) is defined as the area of the overlap between the detection bounding box and the ground truth divided by the area of their union. When the IOU is greater than 0.5, the detection is counted as successful and recorded in TP; when the IOU is less than 0.5, it is considered a false alarm and recorded in FP. Undetected ships are recorded in FN. Given that this dataset contains only the ship category, the mAP, which averages the AP over all classes, equals the AP.

      $$ \quad {\rm{AP}} = \sum\limits_0^1 {({r_{n + 1}} - {r_n}){p_{\rm{{interp}}}}({r_{n + 1}})} $$ (1)
      $$ \quad {p_{{\rm{interp}}}}({r_{n + 1}}) = \mathop {\max }\limits_{\tilde r:\tilde r \ge {r_{n + 1}}} p(\tilde r) $$ (2)
      $$ \quad p = \frac{{{\rm{TP}}}}{{{\rm{TP}} + {\rm{FP}}}} $$ (3)
      $$ \quad r = \frac{{{\rm{TP}}}}{{{\rm{TP}} + {\rm{FN}}}} $$ (4)
      $$ {\rm{IOU}} = \frac{{{\rm{area}}({{\rm{B}}_p} \cap {\rm{B}}{}_{gt})}}{{{\rm{area}}({{\rm{B}}_p} \cup {\rm{B}}{}_{gt})}} $$ (5)
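      Eqs. (1)-(5) can be implemented directly. The sketch below is a common interpolated-AP and IoU implementation consistent with the formulas; the exact evaluation script used for the benchmark may differ, and the toy boxes and precision-recall points are assumptions.

```python
import numpy as np

def iou(box_p, box_gt):
    """IoU of two boxes given as (x1, y1, x2, y2), per Eq. (5)."""
    x1 = max(box_p[0], box_gt[0]); y1 = max(box_p[1], box_gt[1])
    x2 = min(box_p[2], box_gt[2]); y2 = min(box_p[3], box_gt[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box_p) + area(box_gt) - inter)

def average_precision(recalls, precisions):
    """Interpolated AP per Eqs. (1)-(2): at each recall level take the
    maximum precision at any recall >= that level, then sum the areas."""
    r = np.concatenate(([0.0], recalls, [1.0]))
    p = np.concatenate(([0.0], precisions, [0.0]))
    # make precision monotonically non-increasing (the max in Eq. (2))
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 3))  # 0.143
```

      For example, a detector reaching precision 1.0 at recall 0.5 and nothing beyond scores an AP of 0.5, matching the intuition that AP is the area under the interpolated precision-recall curve.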

      In the computer vision field, the deep learning detection algorithms SSD, YOLOv1, and Faster-RCNN, as well as the detection algorithm based on rotation augmentation, are tested using the open-source framework PyTorch. Jiao Jiao et al. tested the DCENN algorithm with the open-source framework TensorFlow. In the experiments, the SAR images are cropped into 500 × 500 pixel slices, and the data are then augmented by image flipping, image rotation, contrast enhancement, and random scaling. The training sets used by the Faster-RCNN, SSD-512, SSD-300, and YOLOv1 algorithms are augmented by 90-degree rotation, while the training set used by the Faster-RCNN-based rotation augmentation algorithm is augmented by dense rotation at 10-degree intervals. SSD is tested at two input sizes, SSD-300 and SSD-512. The learning rate is 0.00001 and the momentum is set to 0.99. Within the GPU memory limit, the batch sizes of SSD-300, SSD-512, Faster-RCNN, and DCENN are 24, 4, 12, and 12, respectively. The other hyperparameters are set as in Ref. [22]. The hyperparameters of Faster-RCNN-DR are exactly the same as those of Faster-RCNN.
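      The 90-degree rotation augmentation above can be sketched for a single image slice and its bounding boxes. Only the lossless 90-degree case is shown; the dense 10-degree rotations would additionally require image interpolation and rotated-box handling. The (x1, y1, x2, y2) corner convention and the toy slice are assumptions.

```python
import numpy as np

def rotate90(image, boxes):
    """Rotate an (H, W) image slice and its (x1, y1, x2, y2) boxes by
    90 degrees counter-clockwise: a pixel (x, y) maps to (y, W - 1 - x)."""
    h, w = image.shape
    rotated = np.rot90(image)
    new_boxes = []
    for x1, y1, x2, y2 in boxes:
        xs = (y1, y2)
        ys = (w - 1 - x1, w - 1 - x2)
        new_boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return rotated, new_boxes

# A 3x4 slice with a 2x2 "ship" in its top-right corner
img = np.zeros((3, 4))
img[0:2, 2:4] = 1.0
rot, boxes = rotate90(img, [(2, 0, 3, 1)])
print(rot.shape, boxes)  # (4, 3) [(0, 0, 1, 1)]
```

      Applying the transform four times cycles through all 90-degree orientations, which is how the fourfold augmented training sets above are produced.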

      The ship detection performance of each deep learning algorithm is shown in Tab. 3, in which the running speed of each algorithm is measured in FPS, the number of images the algorithm can process per second. The input test image size of DCENN, Faster-RCNN-DR, Faster-RCNN, and YOLOv1 is 500 × 500; that of SSD-512 is 512 × 512, and that of SSD-300 is 300 × 300. The table shows that, among the algorithms trained with 90-degree rotation augmentation, YOLOv1 has the worst performance but the fastest running speed, while the SAR ship detection algorithm proposed in Ref. [25] has the best performance but the slowest running speed. Among the single-stage detectors, YOLOv1 does not use anchor boxes for prediction; instead, it divides the image into an S × S grid, with each grid cell predicting only one object. YOLOv1 therefore performs poorly on the dense small targets of AIR-SARShip-1.0, but removing the anchor boxes also makes it the fastest. SSD introduces anchor boxes during training and predicts on multiple feature layers of the network, which makes up for the deficiency of YOLOv1 and improves detection performance, at a slightly lower speed than YOLOv1. As a typical two-stage detector, Faster-RCNN uses an RPN to generate candidate boxes so that the subsequent network can regress the target box position more accurately, and it outperforms the single-stage detectors. However, it also inherits the drawback of two-stage algorithms: its running speed is clearly slower than that of the single-stage algorithms. Compared with Faster-RCNN, Faster-RCNN-DR improves performance by 4.9% because dense rotation increases the richness and angular diversity of the dataset to a certain extent; since no additional computation is performed at test time, its running time is essentially the same as that of Faster-RCNN. The DCENN ship detection algorithm extracts ship features better thanks to its dense connections and prediction on multiple feature layers, and hence achieves the best performance. Nevertheless, the dense connections also demand heavy computation, which lowers the processing efficiency.
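      The one-object-per-cell limitation of YOLOv1 discussed above can be illustrated with a toy grid assignment. The grid size S = 7 and the ship positions are assumptions chosen to show how closely moored ships compete for a single prediction slot.

```python
import numpy as np

def grid_assign(centers, s=7):
    """Count how many ship centres fall into each cell of an S x S grid
    (coordinates normalised to [0, 1)). YOLOv1 predicts one object per
    cell, so any cell holding two or more ships must lose detections."""
    counts = np.zeros((s, s), dtype=int)
    for cx, cy in centers:
        counts[int(cy * s), int(cx * s)] += 1
    return counts

# Four closely moored ships: all land in the same 7x7 grid cell
ships = [(0.50, 0.50), (0.52, 0.51), (0.55, 0.53), (0.51, 0.55)]
counts = grid_assign(ships, s=7)
print(counts.max())  # 4
```

      All four centres map to one cell, so at most one of the four ships can be predicted; anchor-based detectors such as SSD and Faster-RCNN avoid this hard limit.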

      Table 3.  The performance benchmarks of SAR ship detection algorithms based on deep learning

      Performance ranking Algorithm AP(%) FPS
      1 DCENN 88.1 24
      2 Faster-RCNN-DR 84.2 29
      3 Faster-RCNN 79.3 30
      4 SSD-512 74.3 64
      5 SSD-300 72.4 151
      6 YOLOv1 64.7 160

      Tab. 4 gives the detection results of three representative algorithms in two different scenarios: nearshore and offshore. The detection accuracy in the offshore scenario is clearly higher than in the nearshore scenario. The best offshore accuracy on this dataset exceeds 95%, while the nearshore performance drops by more than 20%. This accords with the fact that the offshore background is relatively uniform and less noisy, whereas the nearshore scenario suffers interference from wharfs, buildings, and land. To a certain extent, this finding also shows that a large gap still exists between scientific research on nearshore ship detection and practical use, which remains a challenging research topic.

      Table 4.  The performance benchmarks of different scenes based on different algorithms

      Performance ranking Algorithm Nearshore ship AP(%) Offshore ship AP(%)
      1 DCENN 68.1 96.3
      2 Faster-RCNN-DR 57.6 94.6
      3 SSD-512 40.3 89.4

      To show the detection effect on the AIR-SARShip-1.0 dataset intuitively, we take one SAR image as an example and use the Faster-RCNN algorithm to detect ships. The results are shown in Fig. 8. The number in each green box represents the confidence of the detection. Most ships are detected with correct rectangles, as shown in Fig. 8(c). However, some unsatisfactory results remain, e.g., a false alarm (Fig. 8(a)), a ship detected with an inaccurate rectangle (Fig. 8(b)), and a missed ship (Fig. 8(d)), indicating that further research and improvement are needed.

      Figure 8.  Detection example of SAR ship based on Faster-RCNN

    • To promote the application of deep learning technology in SAR ship detection, this paper publishes a large-scale high-resolution dataset called AIR-SARShip-1.0, which covers two scenarios, nearshore and offshore. Both traditional ship detection algorithms and common deep learning detection algorithms are tested on it experimentally. The deep learning algorithms achieve significantly better detection performance than the traditional algorithms. Built on densely connected network structures, the DCENN detection algorithm predicts from multiple fused feature layers; it achieves the highest AP but the slowest running speed. The data augmentation method using dense angular rotation can increase the angular diversity of the data to a certain extent, which improves model performance without adding any computation at prediction time. In addition, different algorithms are tested in the nearshore and offshore scenarios. The performance difference among algorithms is small in the offshore scenario but significant in the nearshore scenario, indicating that the nearshore environment is more complicated and the ship detection task there faces more challenges. The experimental results establish a performance benchmark for the AIR-SARShip-1.0 dataset, enabling other scholars to conveniently carry out further research on SAR ship detection.

    • The high-resolution SAR ship detection dataset AIR-SARShip-1.0 is published on the official website of the Journal of Radars and has been uploaded to the “data / SAR sample dataset” page (App. Fig. 1) at http://radars.ie.ac.cn/web/data/getData?dataType=SARDataset.

      To increase the utilization of domestic data and promote research on advanced technologies such as SAR target detection, the AIR-SARShip-1.0 dataset was built under the National Major Project on High-Resolution Earth Observation System. The dataset contains large-scene images covering typical types of ships, making it close to practical applications. AIR-SARShip-1.0 is owned by the National Science and Technology Major Project of High-Resolution Earth Observation System and the Aerospace Information Research Institute, Chinese Academy of Sciences. The editorial department of the Journal of Radars holds the editorial rights.
