License: All the original materials used in this workshop are available under CC-BY-SA.

Overview

Reasons

There are several packages in R dealing with satellite images (e.g., MODISTools or satellite). So, why developing a new package?

  • Centralize: Three major sources of freely accessible satellite images; Landsat, MODIS, and Sentinel. Their combination increases the spatio-temporal resolution of remotely sensed information (e.g., data fusion techniques as in Belgiu and Stein (2019)). A single access point can foster multi-program analyses.

  • Standardize: Satellite programs differ in their data standards and formats (compression formats, naming conventions, bands, tiling systems, etc.). Web services offer different sharing protocols (GUIs and APIs) and data products. Programming can deal with this complexity in the background and offer a consistent experience to the user.

  • Automate: The analysis of the earth’s surface patterns and dynamics frequently requires long series of images. Also, data-sets must undergo multiple transformations from acquisition to analysis. We found convenient to develop a comprehensive package that facilitates the use of computationally efficient routines.

Workflow

There is a large number of functions in RGISTools. Functions are named in a systematic way to easily find the function best suited for an specific satellite program and purpose:

  1. The prefix of the function’s name indicates the data source:
  • Landsat program:
    • ls: Functions devoted to Landsat imagery.
      • ls7: for Landsat-7.
      • ls8: for Landsat-8.
  • MODIS program:
    • mod: Functions devoted to MODIS imagery.
  • Copernicus program:
    • sen: Functions devoted to Sentinel(-2) imagery.
  1. The suffix reflects the functionality:
  • Retrieve: getting data
    • Search: find imagery for a given ROI and time-span.
    • Preview: inspect the overlap with the ROI and cloud coverage.
    • Download: acquire images massively.
  • Customize: manufacturing data
    • Mosaic: combine tiles/scenes and clip the extent of the ROI.
    • FolderToVar: calculate indexes, such as NDVI.
  • Process: cleaning data
    • CloudMask: remove the values of cloudy pixels.
    • Composition: generate regular series and improve image quality (e.g., Wolfe, Roy, and Vermote (1998)).
    • SmoothingIMA: gap-filling and smoothing data to resolve missing values and outliers (Militino et al. 2019).

For instance, the function that downloads Sentinel images is called senDownload(). There are other functions in the package not mentioned in this list. Right now, the package is focused on multispectral images.


Demo

Today’s demo:

  • Analyze change-points patterns in a time-series of images of the normalized difference vegetation index (NDVI) - an indicator of vegetation greenness - and the land surface temperatures (LST).
  • The analysis concerns Castellón (Spain) and Nordrhein-Westfalen (Germany).
  • The time period of analysis covers 2001-2018.

In this first session:

  • Download and customize (a sample of) the data-set for the analysis.
  • Walk-through with Castellón and NDVI.
  • Exercises with Nordrhein-Westfalen and LST.

Set-outs

The package

Install the latest version of the package from github, and load it:

library(devtools)
install_github("spatialstatisticsupna/RGISTools")
library(RGISTools)

Accounts

The package interacts with the EarthData and SciHub web services. Web services provide access to archives of satellite images. They require a profile that you can create here and here. For this workshop, use the following credentials:

username <- "geomundus2019"
password <- "Geomundus2019"

(The profile will be deleted after the workshop. Make sure to register)

Region of interest (ROI)

Since we are dealing with administrative divisions, we use the database of Global Administrative Division Maps provided by raster (Hijmans 2019). The following code downloads the shape-file of Spain and its provinces:

spain <- getData('GADM', country = "Spain", level = 2)

From there, we select the region of Castellón:

castellon <- subset(spain, NAME_2 == "Castellón")

We can display the results using tmap (Tennekes 2018). The package is loaded with RGISTools and used for visualization. The tmap package supports many spatial data formats, is flexible, and allows interactive and static maps with the same code.

We set the option to display interactive maps:

tmap_mode("view")
## tmap mode set to interactive viewing

We display our SpatialPolygonDataFrame of Castellon on a map:

tmap_leaflet(tm_shape(castellon) +
  tm_polygons(col = "red") +
  tm_view(set.view = 4))

Exercise

  1. Download the shape-file of Germany.
  2. Filter the shape-file to get the region of Nordrhein-Westfalen.

Replace the <> code conveniently:

germany <- getData('GADM', country = <COUNTRYNAME>, level = 2)
nordrhein <- subset(germany, NAME_1 == <REGIONAME>)

tmap_leaflet(tm_shape(nordrhein) +
  tm_polygons(col = "red") +
  tm_view(set.view = 4))

Retrieve

Exercise

  1. Search the available images of the LST product (MOD11A2) intersecting the Nordrhein-Westfalen region (nordrhein variable) during the same time period as before.
  2. Answer the following questions:
(a) How many images did you find?
(b) How many tiles intersect with Nordrhein-Westfalen?
(c) What is the temporal frequency? 

Replace the <> code conveniently:

sres.lst <- modSearch(product = <PRODUCT>,
                      region = <REGION>,
                      dates = <DATES>)

tileid.lst <- <CODE>
sres.dates <- <CODE>

Download

RGISTools works with satellite images locally on your computer (images stored on disk), which has two implications:

  1. Memory issues:
  • RAM: Images are handled externally to R (GDAL via sf, Pebesma (2018) whenever possible). The series of images are loaded at the end, when the data-set contains only useful information for the analysis.
  • Disk: strategies via arguments (see below).
  1. File management issues: a clear hierarchy of files and folder is key.

The function modDownload() requires (at least):

  1. The search list (searchres argument) from Earth Data (sres.ndvi).
  2. The directory (AppRoot argument) where the images are placed.
  3. The credentials of an Earth Data account (username and password arguments).
  4. An instruction whether to convert (extract.tif = TRUE) the bands from the original files (in Hierarchical Data Format or HDF) into GeoTiff.

As an example, we download the images for the first day (4 tiles) in a new directory ./exercises/ndvi:

wdir.ndvi <- file.path("./", "exercises", "ndvi")
modDownload(searchres = sres.ndvi[1:4],
            AppRoot = wdir.ndvi,
            username = "geomundus2019",
            password = "Geomundus2019",
            extract.tif = TRUE)

The function creates two new folders:

  • ./exercises/ndvi/hdf: contains the HDF files.
  • ./exercises/ndvi/tif: contains the extracted bands, being each band a separate GeoTiff.

These 4 images take around 51 MB and 150 MB with the HDF and GeoTiff files! To be more efficient, we can specify:

  • bFilter: converts into GeoTiffs only the relevant bands.
  • raw.rm: removes the raw files once the GeoTiffs are created.

If we remove the tif folder:

wdir.ndvi.tif <- file.path(wdir.ndvi, "tif")
unlink(wdir.ndvi.tif, recursive = TRUE)

And we run the function adding bFilter = "NDVI" to select only the “NDVI” band and remove the original files by setting raw.rm = TRUE:

modDownload(searchres = sres.ndvi[1:4],
            AppRoot = wdir.ndvi,
            username = "geomundus2019",
            password = "Geomundus2019",
            extract.tif = TRUE,
            bFilter = "NDVI",
            raw.rm = TRUE)

Then, the folders:

  • ./exercises/ndvi/hdf: has no image.
  • ./exercises/ndvi/tif: contains one band.

We went from 201 MB to 11 MB. Note that the second time, the function runs faster. modDownload() detects that the original files are there and therefore, only converts the bands to GeoTiffs.

Exercise

  1. Create a new directory for the LST images.
  2. Run the function to download the first LST image.
  3. Remember to filter the bands that are extracted ("LST_Day_1km") and indicate raw.rm = TRUE.

Replace the <> code conveniently:

wdir.lst <- <LSTDIRECTORY>
modDownload(<ARGUMENTS>)

Customize

Mosaicking and cropping

Mosaicking means joining images captured on the same date but belonging to different tiles. Cropping means removing the pixels outside the bounding box of the ROI. Both tasks are carried out by modMosaic(), which requires:

  1. The folder directory (src argument) that stores the GeoTiffs.
  2. If needed, a ROI (region argument) around which images are cropped.
  3. The name of the folder (out.name) where the results will be saved.
  4. The directory (AppRoot argument) defined with the argument.
modMosaic(src = wdir.ndvi.tif,
          region = castellon,
          out.name = "castellon",
          AppRoot = wdir.ndvi)

You will notice the new folder that has been added to the current working directory. Now, the folders are:

  • ./exercises/ndvi/hdf: has no image.
  • ./exercises/ndvi/tif: contains one band.
  • ./exercises/ndvi/castellon: contains the mosaicked image.

Thanks to cropping, now the size of memory storage consumed by our data-set is roughly 0.36 MB.

Exercise

  1. Define the directory where your LST GeoTiff image is saved.
  2. Mosaic and crop the image around Nordrhein-Westfalen (nordhrein variable).

Replace the <> code conveniently:

wdir.lst.tif <- <LSTDIRECTORYTIF>
modMosaic(src = wdir.lst.tif,
          <OTHERARGUMENTS>,
          AppRoot = <LSTDIRECTORY>)

Import and visualize

We can import the imagery using the stack() function from the raster package (Hijmans 2019) as follows:

wdir.ndvi.mos <- file.path(wdir.ndvi, "castellon")
files.ndvi <- list.files(wdir.ndvi.mos, full.names = TRUE, recursive = TRUE)
imgs.ndvi <- stack(files.ndvi)

It is necessary to re-scale the image to the [-1,1] NDVI range. Information about the scaling parameter can be found in the product catalogue. There, we find that the re-scaling factor is 0.0001:

imgs.ndvi <- imgs.ndvi * 0.0001

RGISTools has the genPlotGIS() function to visualize your data. It is a wrapper function of tmap (Tennekes 2018), configured to rapidly display the spatial data usually handled within the package. The basic inputs are:

  1. A RasterStack of satellite images (r argument).
  2. The ROI (region argument).
tmap_mode("view")
tmap_leaflet(
  genPlotGIS(r = imgs.ndvi,
            region = castellon,
            tm.raster.r.title = "NDVI",
            tm.raster.r.alpha = 0.35), add.titles = FALSE)

Exercise

  1. Import your LST image.
  2. Find out whether the image must to be re-scaled and, if so, define and apply the scaling factor. hint.
  3. Show your results on a map.

Replace the <> code conveniently:

wdir.lst <- <LSTDIRECTORYMOSAIC>
files.lst <- <FILES>
imgs.lst <- stack(files.lst)

scale.factor <- <SCALEFACTOR> # no re-scaling, set to 1 
imgs.lst <- imgs.lst * scale.factor
genPlotGIS(<CODE>)

Concluding remarks

  • The package is a way to access, customize, and process time-series of satellite imagery from several platforms.
  • As an introduction, we just downloaded and customized LST and NDVI images from MODIS for two locations: Castellon (Spain) and Nordhrein-Westfalen (Germany).
  • Being memory efficient is key to work locally with satellite images. The package offers several strategies (band filtering, mosaicking, GDAL, etc.).
  • It is crucial to develop a clear hierarchy for the files and folders created along the working steps.
  • The walk-through and exercises focused on MODIS, but multispectral data from other satellite programs can be similarly retrieved.
  • The walk-through and exercises cover the download and customization, and does not cover the processing (cloud masking, compositing, and filling-smoothing).
  • Stay tuned for new vignettes at: https://github.com/spatialstatisticsupna/RGISTools

References

Belgiu, Mariana, and Alfred Stein. 2019. “Spatiotemporal image fusion in remote sensing.” Remote sensing 11 (7): 818. https://doi.org/10.3390/rs11070818.

Hijmans, Robert J. 2019. Raster: Geographic Data Analysis and Modeling. https://CRAN.R-project.org/package=raster.

Militino, Ana F, M Dolores Ugarte, Unai Pérez-Goya, and Marc G Genton. 2019. “Interpolation of the Mean Anomalies for Cloud Filling in Land Surface Temperature and Normalized Difference Vegetation Index.” IEEE Transactions on Geoscience and Remote Sensing 57 (8): 6068–78. https://doi.org/10.1109/TGRS.2019.2904193.

Pebesma, Edzer. 2018. “Simple Features for R: Standardized Support for Spatial Vector Data.” The R Journal 10 (1): 439–46. https://doi.org/10.32614/RJ-2018-009.

Tennekes, Martijn. 2018. “tmap: Thematic Maps in R.” Journal of Statistical Software 84 (6): 1–39. https://doi.org/10.18637/jss.v084.i06.

Wolfe, Robert E, David P Roy, and Eric Vermote. 1998. “MODIS Land Data Storage, Gridding, and Compositing Methodology: Level 2 Grid.” IEEE Transactions on Geoscience and Remote Sensing 36 (4): 1324–38. https://doi.org/10.1109/36.701082.