A generic Self-Supervised Learning (SSL) framework for representation learning from spectral–spatial features of unlabeled remote sensing imagery

Zhang, Xin ORCID: https://orcid.org/0000-0001-7844-593X and Han, Liangxiu ORCID: https://orcid.org/0000-0003-2491-7473 (2023) A generic Self-Supervised Learning (SSL) framework for representation learning from spectral–spatial features of unlabeled remote sensing imagery. Remote Sensing, 15 (21). 5238. ISSN 2072-4292

Published Version
Available under License Creative Commons Attribution.

Official URL: http://dx.doi.org/10.3390/rs15215238

Abstract

Remote sensing data has been widely used for various Earth Observation (EO) missions such as land use and cover classification, weather forecasting, agricultural management, and environmental monitoring. Most existing models built on remote sensing data rely on supervised learning, which requires large, representative human-labeled datasets that are costly and time-consuming to produce. The recent introduction of self-supervised learning (SSL) enables models to learn a representation from orders of magnitude more unlabeled data. The success of SSL depends heavily on a pre-designed pretext task, which introduces an inductive bias into the model from a large amount of unlabeled data. Because remote sensing imagery carries rich spectral information beyond the standard RGB color space, the pretext tasks established in computer vision for RGB images may not extend straightforwardly to the multi- and hyperspectral domain. To address this challenge, this work proposes a generic self-supervised learning framework for remote sensing data at both the object and pixel levels. The method contains two novel pretext tasks, one for object-based and one for pixel-based remote sensing data analysis. The first pretext task reconstructs the spectral profile from masked data; it extracts a representation of pixel information and improves the performance of downstream tasks associated with pixel-based analysis. The second pretext task identifies objects from multiple views of the same object in multispectral data; it extracts a representation that improves the performance of downstream tasks associated with object-based analysis.
Results on two typical downstream tasks (a multilabel land cover classification task on Sentinel-2 multispectral datasets and a ground soil parameter retrieval task on hyperspectral datasets) demonstrate that the proposed SSL method learns a target representation that covers both spatial and spectral information from massive unlabeled data. A comparison with currently available SSL methods shows that the proposed method, which emphasizes both spectral and spatial features, outperforms them on multi- and hyperspectral remote sensing datasets. We believe that this approach has the potential to be effective in a wider range of remote sensing applications, and we will explore its utility in such applications in future work.
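
The two pretext tasks named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the network architecture, the masking ratio, or the loss functions, so everything below is an assumption made for illustration. The function names (`mask_spectrum`, `masked_reconstruction_loss`, `info_nce_loss`), the 50% masking ratio, the mean squared error on masked bands, and the use of an InfoNCE-style contrastive loss for the multi-view object task are all hypothetical choices.

```python
import numpy as np


def mask_spectrum(spectrum, mask_ratio=0.5, rng=None):
    """Zero out a random subset of spectral bands (assumed masking scheme).

    Returns the masked copy and a boolean mask that is True on hidden bands.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_bands = spectrum.shape[-1]
    n_masked = int(round(mask_ratio * n_bands))
    masked_idx = rng.choice(n_bands, size=n_masked, replace=False)
    mask = np.zeros(n_bands, dtype=bool)
    mask[masked_idx] = True
    masked = spectrum.copy()
    masked[..., mask] = 0.0
    return masked, mask


def masked_reconstruction_loss(prediction, target, mask):
    """Mean squared error computed on the hidden bands only (assumed loss)."""
    diff = prediction[..., mask] - target[..., mask]
    return float(np.mean(diff ** 2))


def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed stand-in for the object task).

    Row i of view_a and row i of view_b are embeddings of two views of the
    same object; all other rows in the batch act as negatives.
    """
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Pixel-level task: hide half of a 12-band spectral profile.
spectrum = np.linspace(0.1, 1.2, 12)
masked, mask = mask_spectrum(spectrum, mask_ratio=0.5, rng=np.random.default_rng(0))
pixel_loss = masked_reconstruction_loss(masked, spectrum, mask)

# Object-level task: matching views should score lower than mismatched ones.
embeddings = np.eye(4)
aligned_loss = info_nce_loss(embeddings, embeddings)
shuffled_loss = info_nce_loss(embeddings, embeddings[::-1])
```

In this sketch a model would be trained to map `masked` back to `spectrum` for the pixel-level task, and to produce embeddings for which `aligned_loss` is small for the object-level task; the encoder itself is deliberately left out.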

Item Type:	Article
Peer-reviewed:	Yes
Date Deposited:	16 Nov 2023 10:44
Publisher:	MDPI AG
Additional Information:	This is an open access article which originally appeared in Remote Sensing, published by MDPI
Divisions:	Faculties > Science and Engineering
Subject terms:	0203 Classical Physics, 0406 Physical Geography and Environmental Geoscience, 0909 Geomatic Engineering
Data Access Statement:	Publicly available datasets were analyzed in this study.
URI:	https://e-space.mmu.ac.uk/id/eprint/633172
DOI:	https://doi.org/10.3390/rs15215238
ISSN:	2072-4292
e-ISSN:	2072-4292
