key: cord-0058549-w4qp3ei5 authors: De Luca, Giandomenico; Modica, Giuseppe; Fattore, Carmen; Lasaponara, Rosa title: Unsupervised Burned Area Mapping in a Protected Natural Site. An Approach Using SAR Sentinel-1 Data and K-mean Algorithm date: 2020-08-26 journal: Computational Science and Its Applications - ICCSA 2020 DOI: 10.1007/978-3-030-58814-4_5 sha: 9757e77b06bfd541d1723d329a5c2fff00e0340f doc_id: 58549 cord_uid: w4qp3ei5 This paper is focused on investigating the capabilities of SAR S-1 sensors for burned area mapping. To this aim, we analyzed S-1 data focusing on a fire that occurred on August 10(th,) 2017, in a protected natural site. An unsupervised classification, using a k-mean machine learning algorithm, was carried out, and the choice of an adequate number of clusters was guided by the calculation of the silhouette score. The ΔNBR index calculated from optical S-2 based images was used to evaluate the burned area delimitation accuracy. The fire covered around 38.51 km(2) and also affected areas outside the boundaries of the reserve. S-1 based outputs successfully matched the S-2 burnt mapping. In the Mediterranean Basin, wildfires represent one of the primary disturbances, which causes economic damages and significant changes in landscapes, forests, and environmental ecosystems [1] . In the south of Italy, the 2017 fire-season has been one of the worse of the last decades because of the large number of burned woodlands and their extent [2, 3] , also favored by the drought climatic trend of that year. Protected natural sites are environments characterized by higher ecological value and biodiversity and are among the most vulnerable contexts. For this reason, the aspect concerning post-fire management is one of the most critical issues for the safeguarding of flora and fauna. Timely and accurate detection and quantification of burned areas are the first operational step after the fire occurrence for assessing the damage and addressing the post-event management [3] [4] [5] [6] . Moreover, this step is fundamental also for the subsequent implementing medium and long-term territorial planning strategies, in order to predict and manage irreversible forest and landscape degradation processes, and land cover changing [7] [8] [9] [10] [11] [12] . In this context, remote sensing (RS) techniques provide reliable tools and techniques for monitoring and quantifying the location and the impact of burned areas with particular reference to satellite platforms [4] . The majority of studies on the detection of burned areas has been based on multi-spectral (MS) (i.e., optical) satellite data [4] . However, these platforms are sensitive to some environmental conditions, such as sunlight and cloud cover, representing a limitation, especially for areas frequently affected by the presence of clouds [3, 13] . Therefore, methods based on data acquired by cloudindependent sensors at high spatial and temporal resolutions are needed. Among them, Synthetic Aperture Radar (SAR) sensors provide an efficient system for discriminating events that cause changes in objects on the Earth's surface [4, [14] [15] [16] [17] [18] . To detect burned areas, radar technology exploits the variations in microwaves backscatter caused by the modification of the structure and moisture content of the vegetation cover and soil [6, 15, [19] [20] [21] . However, due to its intrinsic characteristics and the high sensitivity of the radar signal to surface conditions, it can be altered and rich in so-called "noises" [19, 22] . Referring to the SAR data, a significant parameter characterizing them is the used polarization [20] . Its complexity of processing causes that this sensor is not widely used compared to optical ones [23] . There are several space missions that provide satellite constellations operating SAR imaging useful for Earth's environmental and fire monitoring purposes [4] . Copernicus missions by European Space Agency (ESA) provides free high spatial and temporal resolution data both from SAR sensors (Sentinel-1, S-1) and from MS optical sensors (Sentinel-2, S-2) (https://sentinel.esa.int/web/sentinel/home). The S-1 constellation comprises two polar-orbiting satellites (S-1A and S-1B) performing Cband radar imaging. The aim of this work is to investigate the capabilities of SAR S-1 sensors for mapping a burned forest area, performing an unsupervised image classification approach using the k-mean machine-learning algorithm. This phase was anticipated by a series of pre-processing steps aimed to 1) reducing the noise of the radar images by calculating the backscatter time average of pre and post-fire datasets; 2) calculation of radar burn difference (RBD) and radar burn ratio (RBR) indices; 3) texture feature extraction; 4) data reduction by means of principal components analysis (PCA) transformation; 5) calculation of the silhouette score to define the number of clusters. Validation of mapping was carried out using the delta normalized burn ratio ( NBR) difference index, carried out using S-2 data as a reference map. A post-processing improvement was necessary to reduce commission errors, filtering smaller clusters and ruling out from errors other small fires occurred all over the scene. The fast-and-easy k-mean algorithm produced a correct overlap of the burned area equal to 82.18%; however, several small commission errors throughout the rest of the scene persisted. The proposed methodology was tested in a study area located in the central area of Sicily (South of Italy, 37°43'N; 14°39'9E), the "Rossomanno-Grottascura-Bellia" regional nature reserve (Fig. 1) . The large part of the woodland was covered by broadleaves (genus Eucalyptus) and coniferous (mainly Pinus pinea L. and Pinus halepensis, Mill) essences. The event occurred on august 10 th 2017, and covered 38.51 km 2 , also affecting neighboring and similar forest areas outside the boundaries of the reserve. For the purpose of our study, Sentinel-1A/B ground range detected (GRD) dualpolarized (vertical-vertical VV, and vertical-horizontal VH polarizations) temporal series, acquired in interferometric wide (IW) mode, were downloaded through the Copernicus Open Access Hub (https://scihub.copernicus.eu/). In total, we acquired two S-1 image-datasets formed by eleven images for the pre-fire period (around for one month before) and five images for the post-fire period (around for one month after) (Table 1) . Besides, we downloaded two Sentinel-2A Level-1C images acquired before and after the fire, respectively, in order to obtain the reference map based on the optical vegetation index. The acquisition dates of the two optical images correspond to the first (pre-fire) and last (post-fire) dates, respectively, of S-1 time-series. Table 1 . Characteristics of Sentinel-1 and Sentinel-2 dataset used for detecting the burned study area. The red lines separate the data acquired before and after the fire for both the missions (S-1 and S-2), respectively. Sentinel-1. The S-1 data pre-processing was performed (separately for the pre and postevent datasets) and using the Sentinel-1 Toolbox implemented in the ESA-SNAP v.7.0 software (http://step.esa.int/main/toolboxes/snap/). The pre-processing started whit the application of the orbit (file automatic downloaded by SNAP), and then the thermal noise was removed. Images were radiometric calibrated and converted to gamma (γ 0 ) and sigma (σ 0 ) noughts backscatter standard conventions. Terrain correction was carried out by an autodownloaded digital elevation model (DEM) obtained from the shuttle radar topography mission (SRTM). In this step, a pixel resampling of 10 m × 10 m and a map projection to WGS84/UTM zone 33 N were performed. All images were clipped in a subset area of 3667 km 2 ( Fig. 1-b) . The same pre-processing procedures were separating applied to each time-series dataset, pre-and post-fire, respectively. Each of the two derived temporal datasets was subject to two separately layer stacking. Due to the speckle noise, using single backscattering images may not provide clear discrimination of the burned area, whereas the use of standard speckle reduction filters may instead reduce useful information [24] [25] [26] . For this reason, in order to reduce speckle noise effects, the backscatter time average was calculated separately for the two-time data series (before and after the fire), for each calibration nought (γ 0 and σ 0 ) and each polarization (VH and VV) [3] . Thus, at the end of this first pre-processing step, we had the following eight layers of data output: • Post-and pre-fire average-γ 0 -VH • Post-and pre-fire average-γ 0 -VV • Post-and pre-fire average-σ 0 -VH • Post-and pre-fire average-σ 0 -VV A second layer stack was performed in order to put all these layer-bands into a unique stack. These individual layers were used to compute the RBD (Eq. 1) (the difference between pre and post-fire backscattered time average for each polarization) and the RBR (Eq. 2) (ratio of the backscattering coefficients between pre to post-fire for each polarization) [3, 27] : RBR xyn = Post-fire average xyn /Pre-fire average xyn (2) Where xy represents a specific polarization (VV or VH) and n a specific calibration nought type (γ 0 or σ 0 ). RBD and RBR were computed for each VV and VH polarization and for each γ 0 and σ 0 nought, thus forming the first eight raster index layers ( Table 2) . For each of these index layers, the GLCM (Grey Level Co-occurrence Matrix) mean and the variance texture features were computed and appended as new informative layers, thus generating a final dataset consisting of 24 images (Table 2) . Sentinel-2. The S-2 MS data were also processed in ESA-SNAP Sentinel-2 Toolbox. The Level-2A products (Bottom-of-Atmosphere) were generated from Level-1C using Sen2Cor SNAP plugin. Then, resampling to 10 m × 10 m pixel size, map projection, and clipping on the same area of S-1 data were carried out. The classic normalized burn ratio (NBR), based on the two burn sensitive bands (near-infrared NIR and short-infrared Table 2 . The twenty-four Sentinel-1 layers obtained after the pre-processing steps. The first eight are the index layers, based on the RBD and RBR indices, and computed between pre-and post-fire backscatter time average, for each calibration nought and each polarization. Layers from 9 th to 24 th represent the GLCM texture features (Mean and Variance) computed for each S-1 index layer. Calibration nought Polarization n SWIR) [28] , was computed for pre-and post-fire datasets, in order to obtain the temporal difference index ( NBR, Eq. 3): This index was used as a reference layer since it allows us to identify better the burned areas perimeter than other methods [28] . Considering of the high number of input data layers, a principal component analysis (PCA) was performed to reduce the dimension of the dataset without losing the original information [29] . An unsupervised classification of all scene, using the k-mean algorithm, was carried out on the first transformed PCA outputs that reached a high enough cumulative variance (greater than or equal to 99.95%). One of the main issues, when a clustering algorithm is going to be performed, is the setting of the number of clusters. To deal with this, the silhouette score [30] , implemented in scikit-learn library (https://scikit-learn.org) [31] , was performed (Eq. 4). It is based on the separation distance between clusters, measured by: Where i is the value of a single-pixel contained in a cluster, a is the average dissimilarity (mean distance) of i to all other objects of the same cluster, and b is the average dissimilarity to the nearest cluster that i is not a part of [30] . This coefficient measure how close each point in a cluster is to the points in neighboring clusters, for a given number of clusters, and the computation of its average results a simple method to address the choice of k-value (number of clusters) [30] . Its value can be in a range from 1 (maximum separation, best k-value) to −1 (minimum separation, worst k-value). We calculated the mean of the silhouette score for different k-values (range from 2 to 20). The errors of the burned area mapped using S-1 layers were detected, taking as reference the NBR calculated using optical data. Since a time-average was used, several small fire-events or surface-changes that occurred during that time-frame could lead to an erroneous assessment of commission errors. To reduce this issue, a post-processing improvement was carried out, applying a semi-automatic filter in order to eliminate clusters that had an area less than 0.1 km 2 . The resulting clusters were analyzed, matching them with the reference NBR map, in order to interpret the errors and justify some of them, excluding from errors clusters that contained a mean value of the NBR pixels average greater than or equal to 0.1 (conventional burned/not-burned threshold [32] ). The classification of commission errors was performed and distinguished on two levels: 1. The commission errors relating to the entire scene, represented by all the clusters outside the burned study area boundary, which were incorrectly labeled with the burned area class. 2. The commission errors relating to the study area, represented by the overestimated pixels incorrectly classified as burned. Figure 2 shows the result layers, only for the gamma (γ 0 ) backscatter convention, of the S-1 images pre-processing in both polarizations (VV and VH). Added to the result layers in sigma (σ 0 ), the resulted 24 images were reduced using a PCA transformation. It is noticeable how some images present a more well-defined area of low backscatter (black area), presumably indicative of the burned area, compared to the others. A clear difference is observed using different polarizations, with cross-polarization (VH) images presenting a sharper and more differentiated scene. The best scenes seemed to be the two VH polarized indexes (RBD and RBR), where the GLCM mean texture was extracted. This confirms the relevance of these texture features as additional useful information layers for classification. In Table 3 , several statistics parameters of the first five PCs are reported. The first five principal components (PCs) that reached a cumulative variance of 99.98% were chosen as the input layer in the classification process (Fig. 1) . Over this value, as shown in Fig. 3 , the PCs lose the importance of their information useful for the discrimination of objects on the scene. This allowed to drastically reduce the number of layers, without losing useful information and thus speeding up the computational process [29] . It is evident that for lower values of k the silhouette score, therefore the separation of the clusters, is higher. Silhouette analysis was be used in the preliminary choice of the most suitable number of clusters. A k-value of 3 was finally chosen, given that we only needed the "burned" land-use class. The rest of the scene fell into "non-burned" class. The clustering process using k-mean was very fast, with a computation time of a few minutes. The clusters resulting from the entire scene are shown in Fig. 5 visual assessment of the first result, it was evident that the burned area was clearly outlined. However, commission errors occurred, represented by small clusters across the entire scene ( Fig. 5-a) . These were composed of 46,708 clusters belonging to the class related to the study area (burned class), whit a maximum dimension range from 100 m 2 to 0.95 km 2 , and only three others exceeding 1 km 2 (max 2.12 km 2 ). Applying the semi-automatic filter, small clusters were deleted, and the number of remaining wrong-classified ones was reduced to 217 ( Fig. 5-b) , characterized by an average area of 0.26 km 2 . Some of these, however, were actually small fire-events (not subject of this study) that occurred during the period covered by the analyzed time-series. This was ascertained comparing and matching clusters with NBR map: clusters that contained a mean value of the NBR pixels average higher than or equal to 0.1 (conventional burned/not-burned threshold [32] ) were not considered as errors, in fact, this confirmed the capability of this method to detect small events using only SAR S-1 based data. The number of clusters corresponding to NBR pixels average >= 0.1, excluding study burned area, were 57 ( Fig. 5-b) . The remaining 160 clusters could be considered as the first level of commission errors, related to the general scene. Nevertheless, with their mean area of 0.24 km 2 , they represented minimal areas if compared whit study burned area ( Fig. 5-b) . However, most of these clusters (85%) maintain NBR pixels average values greater than 0. In fact, the NBR time-difference, as already mentioned, is the classic used to classify fire severity [28, 32] . However, this index exploits the SWIR reflectance increase also caused by the lower moisture content of the detected surface and higher exposure of the soil [15] . This means that, since the SAR sensor is sensitive to changes in humidity and surface roughness [21, 33] , it is possible that it detected soil tillage, irrigated fields, etc., widespread in an area with a strong agricultural vocation. In Fig. 5 -c the vectorized study burnt area delimited by the proposed approach is presented, superimposed, and compared with the reference burned area derived by NBR. The map in Fig. 5 -d shows the areas relating to the correct overlap between the detected burned area with SAR and the reference area, the omission error, and the commission (relating only to the study area) error. The total area detected was of 38.87 km 2 , of which, the 82.18% was the correct overlap, the remaining 17.82% 3) . b) The remaining class-2 clusters after eliminating the smaller ones (<0.1 km). In blue the boundary of the burned area detected; in orange, the clusters representing small fires occurred during the time-frame (corresponding to NBR >= 0.1) (not considered errors); red clusters wrongly classified (first level of commission errors). c) Comparison between study burned area detected by SAR and the reference one NBR derived. d) Area correctly detected (green); areas omitted (red); overestimated areas (yellow). (Color figure online) represented the overestimated burned area (commission). A surface corresponding to 17.09% of the reference burned area was not be detected (omission). The level of overlap is high (82.18%). It represents a good result if compared to other works using SAR data [3, 17, 22, 34, 35] , also considering the use of a straightforward machine learning algorithm [36] applied over an extensive area. The use of PCs that explain the higher variance, resulting from the transformation of a large number of information layers, has probably contributed to achieving this good result. Only relative few small areas appear to be under omission and commission errors. These lasts represent errors centralized only on the study burned area and are considered separately from the first level of commission errors related to the wrongclassified clusters out of the study burned area. This work proposed a processing approach for the rapid mapping of burned wildareas applied to a fire event in a Mediterranean protected natural site using SAR Cband Sentinel-1 based data. The approach detected most of the burned area boundary (82.18%), on a total real burned area of 38.51 km 2 . This is important for an effective and enhanced fire monitoring and management in vulnerable environments since these sensors provide independent images of cloud cover. The use of a time average reduced speckle noise without losing useful information from the images. However, all the anomalies detectable by the SAR sensor and occurred on the surface throughout the time-frame have been detected, increasing the number of commission errors. Therefore, highlighting the potential of this sensor in detecting relatively small events. The use of a large number of layers, subsequently transformed and reduced via PCA, undoubtedly increased the possibility of detecting changes on the scene. The machine learning algorithm used is one of the simplest, proving that S-1 Cband data can be trusted. The implementation of the Silhouette score proved to be an excellent method for quickly choosing the best number of clusters to use. Using more robust supervised algorithms [37] [38] [39] [40] , accuracy is expected to be improved. However, this requires field data to be used as ground-truth for training and validation phases. It could be a topic of the next investigation. S-1 SAR C-band data have proven to be a viable alternative for mapping the extent of the fire, useful in the event that optical data is not available due to cloud cover or fire smoke. A significant optimization of the method could provide for an increase in precision and the ability to capture even small events better. Earth Observation of Wildland Fires in Mediterranean Ecosystems Identification of burned areas and severity Historical background and current developments for mapping burned area from satellite earth observation A spatio-temporal active-fire clustering approach for global burned area mapping at 250 m from MODIS data Exploitation of copernicus sentinels data for sensing fire-disturbed vegetated areas Using Landsat 8 imagery in detecting cork oak (Quercus suber L.) woodlands: a case study in Calabria (Italy) Application, validation and comparison in different geographical contexts of an integrated model for the design of ecological networks Evolution trends of land use/land cover in a mediterranean forest landscape in Italy An index for the assessment of degraded mediterranean forest ecosystems The fourth regime of open space Photogrammetry and remote sensing for the identification and characterization of trees in urban areas Using sentinel 1-SAR for monitoring long term variation in burnt forest areas Burned area detection and mapping using Sentinel-1 backscatter coefficient and thermal anomalies Sensitivity of X-, C-, and L-band SAR backscatter to burn severity in Mediterranean pine forests Burned area detection and mapping: intercomparison of sentinel-1 and sentinel-2 based algorithms over tropical Africa Applicability of the multitemporal coherence approach to sentinel-1 for the detection and delineation of burnt areas in the context of the copernicus emergency management service The potential of multifrequency SAR images for estimating forest biomass in mediterranean areas Remote Sensing and Image Interpretation, 7th edn Effect of the vegetation fire on backscattering: an investigation based on sentinel-1 observations Integration of optical and SAR data for burned area mapping in mediterranean regions Sentinel-1 based algorithm to detect burned areas Land Surface Remote Sensing in Agriculture and Forest Remote Sensing with Imaging Radar Effects and performance of Speckle noise reduction filters on active Radar and SAR images Performance of various speckle reduction filters on synthetic aperture radar image Radar burn ratio for fire severity estimation at canopy level: an example for temperate forests Multiscale mapping of burn area and severity using multisensor satellite data and spatial autocorrelation analysis Remote Sensing Digital Image Analysis, 5th edn Silhouettes: a graphical aid to the interpretation and validation of cluster analysis Scikit-learn: machine learning in python Fire intensity, fire severity and burn severity: a brief review and suggested usage Combining machine learning and compact polarimetry for estimating soil moisture from C-band SAR data Identification of burnt areas in mediterranean forest environments from ERS-2 SAR time series Mapping burn scars, fire severity and soil erosion susceptibility in southern France using multisensoral satellite data Evaluating supervised and unsupervised techniques for land cover mapping using remote sensing data Comparison of machine learning methods applied to SAR images for forest classification in mediterranean areas Developments in landsat land cover classification methods: a review. Remote Sens A review of supervised object-based land-cover image classification Machine-learning applications for the retrieval of forest biomass from airborne P-Band SAR data Acknowledgments. Giandomenico De Luca was supported by the European Commission through the European Social Fund (ESF) and the Regione Calabria.