Code
from datetime import datetime, timedelta
@@ -337,7 +337,7 @@
1.1.1 Searching in the Catalog
The module odc-stac
provides access to free, open-source satellite data. To retrieve the data, we must define several parameters that specify the location and time period for the satellite data. Additionally, we must specify the data collection we wish to access, as multiple collections are available. In this example, we will use multispectral imagery from the Sentinel-2 satellite.
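The parameters described above can be sketched as plain Python values. This is only an illustration: the bounding box, the dates, and the collection ID `sentinel-2-l2a` are assumptions and not taken from the chapter, and the dictionary keys merely follow common STAC search conventions.

```python
from datetime import datetime, timedelta

# Hypothetical area of interest as [lon_min, lat_min, lon_max, lat_max] in degrees
bounding_box = [16.2, 48.1, 16.6, 48.3]

# Hypothetical 30-day window, written as a "start/end" interval string
start = datetime(2023, 5, 1)
end = start + timedelta(days=30)
time_range = f"{start:%Y-%m-%d}/{end:%Y-%m-%d}"

# Assumed collection ID for Sentinel-2 imagery; check the catalog you query
search_params = {
    "collections": ["sentinel-2-l2a"],
    "bbox": bounding_box,
    "datetime": time_range,
}
print(search_params)
```

Keeping the parameters in one dictionary makes it easy to reuse the same location and time period for other collections.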
-
+
Code
dx = 0.0006 # 60m resolution
@@ -390,7 +390,7 @@ 1.1.2 Loading the Data
Now we will load the data directly into an xarray
dataset, which we can use to perform computations on the data. xarray
is a powerful library for working with multi-dimensional arrays, making it well-suited for handling satellite data.
Here’s how we can load the data using odc-stac and xarray:
-
+
Code
# define a geobox for my region
@@ -417,7 +417,7 @@ 1.2.1 RGB Image
With the image data now in our possession, we can proceed with computations and visualizations.
First, we define a mask to exclude cloud cover and areas with missing data. Subsequently, we create a composite median image, where each pixel value represents the median value across all the scenes we have identified. This approach helps to eliminate clouds and outliers present in some of the images, thereby providing a clearer and more representative visualization of the scene.
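The masking and median compositing can be illustrated on a tiny synthetic stack. This is a minimal sketch with NumPy rather than the chapter's xarray code; the pixel values and the `valid` mask are made up.

```python
import numpy as np

# Three hypothetical scenes of a 2x2 image, shaped (time, y, x)
scenes = np.array([
    [[0.10, 0.20], [0.30, 0.40]],
    [[0.90, 0.22], [0.28, 0.95]],  # 0.90 and 0.95 stand in for cloudy pixels
    [[0.12, 0.18], [0.32, 0.42]],
])

# True where a pixel is usable (no cloud, no missing data)
valid = np.array([
    [[True, True], [True, True]],
    [[False, True], [True, False]],
    [[True, True], [True, True]],
])

# Replace invalid pixels with NaN, then take the per-pixel median over time
masked = np.where(valid, scenes, np.nan)
composite = np.nanmedian(masked, axis=0)
print(composite)
```

Because the median ignores the NaN entries, a pixel that is cloudy in one scene still receives a value from the remaining scenes.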
-
+
Code
# define a mask for valid pixels (non-cloud)
@@ -460,7 +460,7 @@
1.2.2 False Color Image
In addition to the regular RGB image, we can swap any band from the visible spectrum with another band. In this case, the red band has been replaced by the near-infrared band. This makes vegetated areas easier to see, since they now appear bright red: plants absorb red light while reflecting near-infrared light (NASA 2020).
-
+
Code
# compute the false color image
@@ -502,7 +502,7 @@ 0.33 to 0.66 are moderately healthy plants
0.66 to 1 are very healthy plants
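As a small worked example, the NDVI is computed as (NIR - Red) / (NIR + Red) and can then be binned with the two thresholds listed above. The reflectance values below are made up, and the label for values under 0.33 is a placeholder because that part of the list is not shown here.

```python
import numpy as np

# Hypothetical red and near-infrared reflectances for three pixels
red = np.array([0.30, 0.10, 0.05])
nir = np.array([0.35, 0.30, 0.50])

# Normalized Difference Vegetation Index
ndvi = (nir - red) / (nir + red)

# Bin edges taken from the ranges above; the first label is a placeholder
names = ["below 0.33", "moderately healthy", "very healthy"]
bins = np.digitize(ndvi, [0.33, 0.66])
labels = [names[i] for i in bins]
print(ndvi, labels)
```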
-
+
Code
# Normalized Difference Vegetation Index (NDVI)
@@ -529,7 +529,7 @@
1.3.1 Regions of Interest
Since this is a supervised classification, we need training data. We therefore define regions that we are certain represent the feature we are classifying. In this case, we are interested in forested areas and in regions that are definitely not forested. These regions will be used to train our classifiers.
-
+
Code
# Define Polygons
@@ -581,7 +581,7 @@
1.3.2 Data Preparation
In addition to the Regions of Interest we will extract the specific bands from the loaded dataset that we intend to use for the classification, which are the red, green, blue
and near-infrared
bands, although other bands can also be utilized. Using these bands, we will create both a training and a testing dataset. The training dataset will be used to train the classifier, while the testing dataset will be employed to evaluate its performance.
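The split into training and testing data can be sketched with a shuffled index. This is a minimal NumPy illustration, not the chapter's code: the feature values, the labels, and the 70/30 ratio are assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical samples: one row per pixel, columns are red, green, blue, nir
n_samples = 10
features = rng.random((n_samples, 4))
labels = np.array([0, 1] * 5)  # 0 = not forested, 1 = forested (made up)

# Shuffle the sample indices and cut them at 70 %
order = rng.permutation(n_samples)
n_train = int(0.7 * n_samples)
train_idx, test_idx = order[:n_train], order[n_train:]

X_train, y_train = features[train_idx], labels[train_idx]
X_test, y_test = features[test_idx], labels[test_idx]
print(X_train.shape, X_test.shape)
```

Keeping the test samples out of training is what makes the later accuracy figures meaningful.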
-
+
Code
# Classifying dataset (only necessary bands)
@@ -628,7 +628,7 @@
Now that we have prepared the training and testing data, we will create an image array of the actual scene that we intend to classify. This array will serve as the input for our classification algorithms, allowing us to apply the trained classifiers to the entire scene and identify the forested and non-forested areas accurately.
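Classifiers usually expect one row per sample, so an image array shaped (latitude, longitude, band) is typically flattened to (pixels, bands) and the predictions are reshaped back into a map afterwards. The sketch below uses a made-up array and a simple threshold as a stand-in for a trained classifier.

```python
import numpy as np

# Hypothetical image with 2 rows, 3 columns and 4 bands
image_data = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
n_rows, n_cols, n_bands = image_data.shape

# Flatten to one row per pixel
pixels = image_data.reshape(-1, n_bands)

# Placeholder for classifier.predict(pixels): threshold on the last band
predicted = (pixels[:, 3] > 10).astype(int)

# Restore the spatial layout of the scene
class_map = predicted.reshape(n_rows, n_cols)
print(class_map)
```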
-
+
Code
image_data = ds_class[bands].to_array(dim='band').transpose('latitude', 'longitude', 'band')
@@ -644,7 +644,7 @@ 1.3.3 Classifying with Naive Bayes
Now that we have prepared all the needed data, we can begin the actual classification process.
We will start with a Naive Bayes classifier. First, we will train the classifier using our training dataset. Once trained, we will apply the classifier to the actual image to identify the forested and non-forested areas.
-
+
Code
# Naive Bayes initialization and training
@@ -661,7 +661,7 @@
+
Code
# Plot Naive Bayes
@@ -735,7 +735,7 @@
1.3.4 Classifying with Random Forest
To ensure our results are robust, we will explore an additional classifier. In this section, we will use the Random Forest classifier. The procedure for using this classifier is the same as before: we will train the classifier using our training dataset and then apply it to the actual image to classify the scene.
-
+
Code
# Random Forest initialization and training
@@ -777,7 +777,7 @@
Actual Negative
-6304
-314
+6294
+324
Actual Positive
-284
-5203
+287
+5200
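The updated confusion matrix can be summarized with a few standard metrics. The sketch assumes that the first column holds predicted negatives and the second predicted positives; the column headers are not visible in this excerpt.

```python
# Counts from the updated table (assumed layout: rows = actual, columns = predicted)
tn, fp = 6294, 324
fn, tp = 287, 5200

total = tn + fp + fn + tp
accuracy = (tp + tn) / total   # share of correctly classified samples
precision = tp / (tp + fp)     # share of predicted positives that are correct
recall = tp / (tp + fn)        # share of actual positives that were found

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```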
@@ -818,7 +818,7 @@
1.3.5 Comparison of the Classifiers
To gain a more in-depth understanding of the classifiers’ performance, we will compare their results. Specifically, we will identify the areas where both classifiers agree and the areas where they disagree. This comparison will provide valuable insights into the strengths and weaknesses of each classifier, allowing us to better assess their effectiveness in identifying forested and non-forested regions.
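One way to build such an agreement map is to combine the two binary results into a single code: 0 for neither classifier, 1 for Naive Bayes only, 2 for Random Forest only, and 3 for both. The arrays below are made up, and the encoding is a sketch that may differ from the chapter's implementation.

```python
import numpy as np

# Hypothetical binary forest maps (1 = forest) from the two classifiers
nb_forest = np.array([[0, 1], [0, 1]])
rf_forest = np.array([[0, 0], [1, 1]])

# 0 = none, 1 = Naive Bayes only, 2 = Random Forest only, 3 = both
agreement = nb_forest + 2 * rf_forest

# Count how many pixels fall into each class
values, counts = np.unique(agreement, return_counts=True)
pixel_counts = dict(zip(values.tolist(), counts.tolist()))
print(agreement, pixel_counts)
```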
-
+
Code
cmap_trio = colors.ListedColormap(['whitesmoke', 'indianred', 'goldenrod', 'darkgreen'])
@@ -845,7 +845,7 @@
+
Code
# Plot only one class, either None (0), Naive Bayes (1), Random Forest (2), or Both (3)
@@ -872,7 +872,7 @@
+
Code
= {}
diff --git a/chapters/01_classification_files/figure-html/cell-13-output-1.png b/chapters/01_classification_files/figure-html/cell-13-output-1.png
index 20f5171..b29e99f 100644
Binary files a/chapters/01_classification_files/figure-html/cell-13-output-1.png and b/chapters/01_classification_files/figure-html/cell-13-output-1.png differ
diff --git a/chapters/01_classification_files/figure-html/cell-14-output-1.png b/chapters/01_classification_files/figure-html/cell-14-output-1.png
index 14f8dd2..479b634 100644
Binary files a/chapters/01_classification_files/figure-html/cell-14-output-1.png and b/chapters/01_classification_files/figure-html/cell-14-output-1.png differ
diff --git a/chapters/01_classification_files/figure-html/cell-15-output-1.png b/chapters/01_classification_files/figure-html/cell-15-output-1.png
index dae0674..eab5b26 100644
Binary files a/chapters/01_classification_files/figure-html/cell-15-output-1.png and b/chapters/01_classification_files/figure-html/cell-15-output-1.png differ
diff --git a/chapters/01_classification_files/figure-html/cell-16-output-1.png b/chapters/01_classification_files/figure-html/cell-16-output-1.png
index 39a46e8..d07cf77 100644
Binary files a/chapters/01_classification_files/figure-html/cell-16-output-1.png and b/chapters/01_classification_files/figure-html/cell-16-output-1.png differ
diff --git a/chapters/02_floodmapping.html b/chapters/02_floodmapping.html
index 1c16e68..ab8b446 100644
--- a/chapters/02_floodmapping.html
+++ b/chapters/02_floodmapping.html
@@ -289,7 +289,7 @@ counts Table of contents
Image from Wikipedia
-
+
%matplotlib widget
import numpy as np
@@ -301,7 +301,7 @@ Table of contents
from scipy.stats import norm
from eomaps import Maps
-
+
sig0_dc = xr.open_dataset('../data/s1_parameters/S1_CSAR_IWGRDH/SIG0/V1M1R1/EQUI7_EU020M/E054N006T3/SIG0_20180228T043908__VV_D080_E054N006T3_EU020M_V1M1R1_S1AIWGRDH_TUWIEN.nc')
@@ -331,7 +331,7 @@
@@ -501,7 +501,7 @@
@@ -516,7 +516,7 @@ \(\sigma^0\). These so-called posteriors need one more piece of information, as can be seen in the equation above. We need the probability that a pixel is flooded \(P(F)\) or not flooded \(P(NF)\). Of course, these are the figures we’ve been trying to find this whole time. We don’t actually have them yet, so what can we do? In Bayesian statistics, we can just start with our best guess. These guesses are called our “priors”, because they are the beliefs we hold prior to looking at the data. This subjective prior belief is the foundation of Bayesian statistics, and we use the likelihoods we just calculated to update our belief in this particular hypothesis. This updated belief is called the “posterior”.
Let’s say that our best estimate for the chance of flooding versus non-flooding of a pixel is 50-50: a coin flip. We can now also calculate the probability of backscattering \(P(\sigma^0)\) as the weighted average of the water and land likelihoods, which ensures that our posteriors range between 0 and 1.
The following code block shows how we calculate the posteriors.
-
+
def calc_posteriors(water_likelihood, land_likelihood):
    evidence = (water_likelihood * 0.5) + (land_likelihood * 0.5)
    return (water_likelihood * 0.5) / evidence, (land_likelihood * 0.5) / evidence
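A short numeric example may help to see the update at work. The sketch below restates the function with an adjustable prior; the `prior_water` parameter and the likelihood values are assumptions added for illustration, and the default of 0.5 reproduces the coin-flip case.

```python
import math

def calc_posteriors_with_prior(water_likelihood, land_likelihood, prior_water=0.5):
    """Return (water posterior, land posterior) for a given prior on water."""
    prior_land = 1.0 - prior_water
    # Evidence: prior-weighted average of the two likelihoods
    evidence = water_likelihood * prior_water + land_likelihood * prior_land
    return (water_likelihood * prior_water / evidence,
            land_likelihood * prior_land / evidence)

# Hypothetical likelihoods of one backscatter value under each class
water_post, land_post = calc_posteriors_with_prior(0.2, 0.05)
print(water_post, land_post)

# A more skeptical, hypothetical prior lowers the water posterior
water_post_low, _ = calc_posteriors_with_prior(0.2, 0.05, prior_water=0.1)
print(water_post_low)
```

Dividing by the evidence is what makes the two posteriors sum to one, whatever prior is chosen.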
@@ -528,7 +528,7 @@
@@ -541,7 +541,7 @@
2.5 Flood Classification
We are now ready to combine all this information and classify the pixels according to the probability of flooding given the backscatter value of each pixel. Here we simply check whether the probability of flooding is higher than the probability of non-flooding:
-
+
def bayesian_flood_decision(id, sig0_dc):
    nf_post_prob, f_post_prob = calc_posteriors(*calc_likelihoods(id, sig0_dc))
    return np.greater(f_post_prob, nf_post_prob)
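The decision rule can be tried on a few made-up pixels. This sketch skips the likelihood calculation and starts from hypothetical likelihood arrays. With equal priors the shared evidence term cancels, so comparing posteriors gives the same answer as comparing the likelihoods directly.

```python
import numpy as np

# Hypothetical likelihoods for three pixels
water_likelihood = np.array([0.30, 0.05, 0.10])
land_likelihood = np.array([0.10, 0.20, 0.10])

# Posteriors under a 50-50 prior
evidence = 0.5 * water_likelihood + 0.5 * land_likelihood
f_post_prob = 0.5 * water_likelihood / evidence
nf_post_prob = 0.5 * land_likelihood / evidence

# A pixel counts as flooded only if the flood posterior is strictly larger
flooded = np.greater(f_post_prob, nf_post_prob)
print(flooded)
```

Note that a tie, as in the third pixel, is classified as not flooded because `np.greater` is a strict comparison.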
@@ -553,7 +553,7 @@
@@ -576,7 +576,7 @@