3 Preprocessing

3.1 Normalization

Before any spectral indices or properties are calculated, the data must be normalized and expressed as reflectance. Here, reflectance is the fraction of the signal between the dark and white references, which are acquired during or before the scan.

Normalization is achieved by following the equation:

\[ ref = (d_{raw} - ref_{d})/(ref_{w} - ref_{d}) \]

where:

ref: normalized data cube;

d_raw: raw captured data;

ref_d: dark reference data;

ref_w: white reference data.

If the white reference was acquired with a different integration time than the sample, the equation is modified to:

\[ ref = (d_{raw} - ref_{d})/(ref_{w} - ref_{d}) * t_{int-w}/t_{int-s} \]

where:

t_int-w: integration time of white reference;

t_int-s: integration time of captured data (sample).
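
The two equations can be checked on a toy example. The sketch below is plain base R on small arrays, not the HSItools implementation; the function name normalize_reflectance and its argument names are hypothetical, chosen only to mirror the symbols above.

```r
# Hypothetical helper mirroring the equations above (not part of HSItools).
# d_raw, ref_d, ref_w: numeric arrays of equal shape.
# t_int_s, t_int_w: integration times of the sample and the white reference.
normalize_reflectance <- function(d_raw, ref_d, ref_w, t_int_s = 1, t_int_w = 1) {
  (d_raw - ref_d) / (ref_w - ref_d) * (t_int_w / t_int_s)
}

# Toy 2 x 2 x 3 cube (rows, columns, bands) with constant references
d_raw <- array(600, dim = c(2, 2, 3))
ref_d <- array(100, dim = c(2, 2, 3))
ref_w <- array(1100, dim = c(2, 2, 3))

# Equal integration times: (600 - 100) / (1100 - 100) = 0.5
ref_equal <- normalize_reflectance(d_raw, ref_d, ref_w)

# Sample integrated for 90, white reference for 30: 0.5 * 30 / 90
ref_scaled <- normalize_reflectance(d_raw, ref_d, ref_w, t_int_s = 90, t_int_w = 30)
```

With equal integration times the ratio term is 1 and the second equation reduces to the first.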

3.1.1 Normalization with Shiny output

If the Shiny GUI was used for data selection, cropping, and calibration, its output can be passed directly to the normalization routine. The normalized files are written to the products directory of your data.

This workflow applies only when you calculate reflectance from the Shiny output. It requires setting the working directory, an approach we otherwise discourage in R. The working directory should point two levels above your capture data directory.

For example, if you are working with “YOUR CORE”, then the working directory should point one level above it, and therefore two levels above the capture directory.

– WORKING DIRECTORY <directory>

–– YOUR CORE <directory>

––– capture <directory>
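
As a quick check of this layout, the working directory can be derived from a capture path by stepping up two levels. This is a base R sketch; the path is only an illustration built from the example names used here.

```r
# Illustrative path to the capture directory of one core
capture_dir <- "C:/GitHub/data/cake/YOUR CORE/capture"

# One level up is the core directory, two levels up is the working directory
core_dir <- dirname(capture_dir)
working_dir <- dirname(core_dir)
```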

# Set working directory
setwd("C:/GitHub/data/cake/") # <1>

1: Path to your working directory. In this example, the cake folder holds two different cores.

You either have the HSItools::run_core() output in memory or you need to read it from disk.

# Read from disk
hsi_tools_core <- readRDS("Achive_half_WR_240212-111356/HSItools_core.rds")

Then, use the standard function to calculate the reflectance.

# Create normalized reflectance file
reflectance <- hsi_tools_core |> # <1>
  HSItools::prepare_core() # <2>

1: The Shiny GUI output.
2: Normalization function.

If you used different integration times for your sample and white references, provide this information in the function.

# Create normalized reflectance file
reflectance <- hsi_tools_core |> # <1>
  HSItools::prepare_core( # <2>
    tints = 90, # <3>
    tintw = 30) # <4>

1: The Shiny GUI output.
2: Normalization function.
3: Integration time of sample (captured data).
4: Integration time of white reference.

Multiple cores

Iterating over multiple directories, for example with the {purrr} package, is planned as an option. This will require removing the setwd() call.
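
In the meantime, the reading step can be done without setwd() by building full paths. The sketch below uses base R only and assumes that each core directory holds an HSItools_core.rds file, as in the example above. The helper name read_cores is hypothetical, and whether HSItools::prepare_core() then runs correctly without the working directory set is not confirmed here.

```r
# Hypothetical helper: read the saved Shiny output of every core under `root`
read_cores <- function(root, file = "HSItools_core.rds") {
  core_dirs <- list.dirs(root, recursive = FALSE)
  rds_paths <- file.path(core_dirs, file)
  # Keep only the cores that actually contain the file
  has_file <- file.exists(rds_paths)
  cores <- lapply(rds_paths[has_file], readRDS)
  names(cores) <- basename(core_dirs[has_file])
  cores
}

# Usage (path is illustrative):
# cores <- read_cores("C:/GitHub/data/cake")
```

purrr::map() could replace lapply() here without other changes.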

3.1.2 Normalization of the directory (no Shiny output)

If no Shiny output is available and input is not produced by hand, the normalization routine can be run without it. In such a case, the entire capture data will be normalized. However, without Shiny output, it is harder to calibrate distances properly.

Here, we normalize all the captured data with all the available layers between 400 and 1000 nm.

# Store the path
path <- "data/my_core" # <1>

# Create normalized reflectance file
reflectance <- HSItools::prepare_core( # <2>
  path = path,
  layers = c(400:1000),
  extent = "capture")

1: Path to your core of choice, root directory. Don’t go to the “capture” directory.
2: Normalization function. The path is your core path, then the selection of layers (bands) and an explicit call informing that the entire capture extent should be normalized.

It is possible to iterate over multiple directories at once using the {purrr} package.

# Store the paths
paths <- list( # <1>
  core_1 = "data/my_core_1",
  core_2 = "data/my_core_2")

# Or easier, with listing
paths <- fs::dir_ls("data") # <2>

# Create normalized reflectance files
reflectance <- paths |> # <3>
  purrr::map(\(i) HSItools::prepare_core(
    path = i,
    layers = c(400:1000),
    extent = "capture")) # <4>

1: Create a list of directories by hand.
2: Or list root directories in your higher-level directory.
3: Your paths.
4: Normalization function iterated over multiple directories.

3.2 Savitzky-Golay smoothing

Applying spectral smoothing, such as the Savitzky-Golay spectral filter (Savitzky and Golay 1964), prevents spurious, random peaks and troughs from significantly influencing the calculation results.

reflectance <- reflectance |> # <1>
  # Specify the file extension
  HSItools::filter_savgol() # <2>

1: Reflectance data, either in memory or read from disk.
2: Calculation function.

Speed

This function applies a smoothing filter to every pixel. Depending on the size of your data, it can take considerable time to run.
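
To illustrate what the filter does to a single spectrum, here is a minimal base R sketch of Savitzky-Golay smoothing: a polynomial is fitted by least squares in a moving window, and the fitted value at the window centre replaces the original value. The function names, the window of 5 bands, and the polynomial order of 2 are assumptions made for illustration; they are not the defaults or internals of HSItools::filter_savgol().

```r
# Smoothing coefficients for a centred window of `n` points (n odd)
# and a polynomial of order `p`, derived by least squares.
savgol_coefs <- function(n = 5, p = 2) {
  stopifnot(n %% 2 == 1, p < n)
  z <- seq(-(n - 1) / 2, (n - 1) / 2)  # window offsets, e.g. -2..2
  A <- outer(z, 0:p, `^`)              # design matrix with columns z^0..z^p
  # First row of the pseudo-inverse gives the fitted value at z = 0
  as.numeric(solve(crossprod(A), t(A))[1, ])
}

# Apply the filter to one spectrum; edge bands without a full window become NA
savgol_smooth <- function(x, n = 5, p = 2) {
  as.numeric(stats::filter(x, savgol_coefs(n, p), sides = 2))
}

# Classic 5-point quadratic weights: c(-3, 12, 17, 12, -3) / 35
coefs <- savgol_coefs(5, 2)

# Toy spectrum with one spurious spike at the fourth band
spectrum <- c(0.10, 0.12, 0.14, 0.30, 0.18, 0.20, 0.22)
smoothed <- savgol_smooth(spectrum)
```

The spike is damped rather than removed, while a purely linear trend passes through the filter unchanged.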

3.3 Median filter

You might be interested in smoothing your data spatially to remove some of the noise in the XY domain. Data is saved to disk.

reflectance <- reflectance |> # <1>
  # Specify the file extension
  HSItools::filter_median() # <2>

1: Reflectance data, either in memory or read from disk.
2: Calculation function. You can set the window (defaults to 3 by 3).

Speed

This function applies a focal median filter to every pixel. Depending on the size of your data, it can take considerable time to run, more than the Savitzky-Golay filter. Consider running it on the calculated index rasters and ROIs instead.
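
The idea of a focal median can be shown on a single band stored as a matrix. The sketch below is base R with a hypothetical helper name, focal_median; it is not how HSItools::filter_median() is implemented, and its edge handling (using only the neighbours that exist) is an assumption.

```r
# Hypothetical helper: w x w moving median over a matrix (w odd).
# Edge cells use only the neighbours that exist.
focal_median <- function(m, w = 3) {
  stopifnot(w %% 2 == 1)
  h <- (w - 1) / 2
  out <- m
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      rows <- max(1, i - h):min(nrow(m), i + h)
      cols <- max(1, j - h):min(ncol(m), j + h)
      out[i, j] <- median(m[rows, cols], na.rm = TRUE)
    }
  }
  out
}

# One band with a single noisy pixel in the centre
band <- matrix(0.2, nrow = 5, ncol = 5)
band[3, 3] <- 0.9
filtered <- focal_median(band)
```

The isolated outlier disappears because it never forms the majority in any 3 by 3 window. The nested loop also shows why the cost grows with every pixel and band.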

3.4 Continuum removal

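We can process the data further by removing the continuum, that is, by dividing each spectrum by its convex hull (the upper envelope fitted over its local maxima). This flattens the overall shape of the spectrum, so that absorption features can be compared against a common baseline.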

ToDo

While the function technically works, it does not yet produce the desired output for HSI purposes.

reflectance_cr <- reflectance |> # <1>
  HSItools::remove_continuum() # <2>

1: Reflectance data, either in memory or read from disk.
2: Calculation function.
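
Continuum removal is commonly implemented by dividing a spectrum by its upper convex hull. The base R sketch below shows this for one spectrum; the helper names upper_hull and remove_continuum_1d are hypothetical, and the code is an illustration of the general technique, not of what HSItools::remove_continuum() does internally.

```r
# Indices of the upper convex hull of a spectrum; x must be strictly increasing
upper_hull <- function(x, y) {
  hull <- integer(0)
  for (i in seq_along(x)) {
    while (length(hull) >= 2) {
      a <- hull[length(hull) - 1]
      b <- hull[length(hull)]
      # Cross product >= 0 means point b lies on or below the chord a -> i
      cross <- (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
      if (cross >= 0) hull <- hull[-length(hull)] else break
    }
    hull <- c(hull, i)
  }
  hull
}

# Hypothetical helper: divide a spectrum by its linearly interpolated hull
remove_continuum_1d <- function(wavelength, reflectance) {
  hull <- upper_hull(wavelength, reflectance)
  continuum <- stats::approx(wavelength[hull], reflectance[hull],
                             xout = wavelength)$y
  reflectance / continuum
}

wavelength <- c(400, 500, 600, 700, 800)
spectrum <- c(0.2, 0.4, 0.3, 0.6, 0.5)
spectrum_cr <- remove_continuum_1d(wavelength, spectrum)
```

Bands that lie on the hull become 1, and the absorption at 600 nm becomes 0.3 / 0.5 = 0.6, because the hull is interpolated between the 500 nm and 700 nm bands.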

3.5 Preview

For later use, generate an RGB (or CIR, NIR) preview.

rgb <- reflectance |> # <1>
  HSItools::stretch_raster_full( # <2>
    type = "RGB", # <3>
    extension = "tif") # <4>

1: Reflectance data, either in memory or read from disk.
2: Stretching function.
3: Type of preview, here RGB (CIR and NIR are also accepted).
4: File extension; png works, too.
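
A preview needs reflectance values rescaled to the display range. As an illustration only, the base R sketch below applies a linear min-max stretch to one band. That HSItools::stretch_raster_full() works this way is an assumption based on its name, and the helper stretch_full is hypothetical.

```r
# Hypothetical helper: linear min-max stretch of one band to 0..255
stretch_full <- function(band) {
  rng <- range(band, na.rm = TRUE)
  # Constant band: return zeros instead of dividing by zero
  if (rng[1] == rng[2]) return(band * 0)
  round((band - rng[1]) / (rng[2] - rng[1]) * 255)
}

# Toy 2 x 2 band of reflectance values
band <- matrix(c(0.05, 0.10, 0.20, 0.45), nrow = 2)
stretched <- stretch_full(band)
```

The darkest pixel maps to 0 and the brightest to 255; applying the same step to three bands gives the channels of a composite preview.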