andurinha

Noemi Álvarez Fernández, Antonio Martínez Cortizas

2020-08-07

This package provides tools that make spectroscopic data processing easier and faster. It finds and selects peaks based on the second derivative or absorbance sum spectrum. It also supplies plotting functions, which make the workflow more user-friendly.

library(andurinha)

Import

There are two common situations when importing spectroscopic data.

  1. In the first case, all the data are in a single file with the structure:

    • First column: wave numbers.

    • The following columns: the sample absorbances.

    In that case, import the data with whichever function best suits the file format (for example, read.csv()).

  2. In the second case, each sample is in a separate file with the structure:

    • First column: wave numbers.

    • Second column: sample absorbance.

In that case, the function importSpectra() can be used. The files must have the .csv extension and sit in the same directory, and that directory must not contain any other files.

spectra <- importSpectra(path = tempdir(), sep = ";")
head(spectra)

#>    WN     A     B     C
#> 1 399 0.011 0.008 0.009
#> 2 401 0.008 0.006 0.006
#> 3 403 0.006 0.005 0.006
#> 4 405 0.005 0.005 0.005
#> 5 407 0.005 0.005 0.003
#> 6 409 0.003 0.004 0.002
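For comparison, both import situations can be reproduced in base R. The sketch below is not how importSpectra() is implemented; the file names, the values, and the choice to name each absorbance column after its file are assumptions made for illustration.

```r
# Hypothetical example data: three wave numbers, two samples.
wn <- c(399, 401, 403)
all_in_one <- data.frame(WN = wn,
                         A = c(0.011, 0.008, 0.006),
                         B = c(0.008, 0.006, 0.005))

# Case 1: everything in a single file, read with one call to read.csv().
single_file <- tempfile(fileext = ".csv")
write.table(all_in_one, single_file, sep = ";", row.names = FALSE)
single <- read.csv(single_file, sep = ";")

# Case 2: one .csv file per sample, alone in a dedicated directory.
spectra_dir <- tempfile("spectra_")
dir.create(spectra_dir)
for (name in c("A", "B")) {
  write.table(data.frame(WN = wn, absorbance = all_in_one[[name]]),
              file.path(spectra_dir, paste0(name, ".csv")),
              sep = ";", row.names = FALSE)
}

# Read every file and name its absorbance column after the file (assumption).
files <- list.files(spectra_dir, pattern = "\\.csv$", full.names = TRUE)
per_sample <- lapply(files, function(f) {
  d <- read.csv(f, sep = ";")
  names(d) <- c("WN", tools::file_path_sans_ext(basename(f)))
  d
})

# Join the per-sample tables on the wave number column.
merged <- Reduce(function(x, y) merge(x, y, by = "WN"), per_sample)
```

Either way, the result has the layout the rest of the workflow expects: wave numbers first, then one absorbance column per sample.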

Find peaks

The function findPeaks() checks the quality of the spectra, finds peaks (surprise!) and lets you select the most relevant ones based on the absorbance or second derivative sum spectrum. To use it, the data must be in the appropriate format: a data frame with wave numbers in the first column and the sample absorbances in the following columns.

Besides the data, this function has five arguments:

  1. resolution: the measurement resolution of the equipment, which is 4 cm⁻¹ by default.

  2. minAbs: the cut-off value used to check spectra quality. By default, each spectrum is expected to reach a maximum absorbance of at least 0.1.

  3. cutOff: the second derivative or absorbance sum spectrum cut-off used to reduce the raw peaks table, which is NULL by default.

  4. scale: TRUE by default, in which case the data is scaled as Z-scores. Use FALSE if you do not want to scale it.

  5. ndd: TRUE by default, in which case peaks are searched for in the second derivative sum spectrum. Use FALSE to search for them in the absorbance sum spectrum instead.
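To make the minAbs and scale arguments concrete, here is a minimal base R sketch of a quality check and a Z-score scaling. It is not the package's code: the data are invented, the names min_abs, low_quality and z are hypothetical, and it assumes that each spectrum (column) is standardised on its own.

```r
# Hypothetical spectra: wave numbers in column 1, one column per sample.
spectra <- data.frame(
  WN = seq(399, 409, by = 2),
  A  = c(0.011, 0.080, 0.150, 0.090, 0.020, 0.003),
  B  = c(0.008, 0.020, 0.060, 0.030, 0.010, 0.004)
)
min_abs <- 0.1  # same value as the documented minAbs default

# Quality check: flag samples whose maximum absorbance is below min_abs.
max_abs <- sapply(spectra[-1], max)
low_quality <- names(max_abs)[max_abs < min_abs]
if (length(low_quality) > 0) {
  message("Maximum absorbance below ", min_abs, " in: ",
          paste(low_quality, collapse = ", "))
}

# Z-scores: subtract the mean and divide by the standard deviation,
# column by column (assumed here; the wave number column is left untouched).
z <- spectra
z[-1] <- lapply(spectra[-1], function(x) (x - mean(x)) / sd(x))
```

In this example only sample B is flagged, because its maximum absorbance is 0.060.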

With all arguments at their defaults, this function returns a list of four data frames:

  1. dataZ: the data standardised as Z-scores.

  2. secondDerivative: the second derivative values of the data.

  3. sumSpectrum_peaksTable: the peak wave numbers and their second derivative or absorbance sum spectrum values.

  4. peaksTable: the peak wave numbers and their absorbance for each spectrum.

By default, a warning is returned if any spectrum has a maximum absorbance lower than 0.1; if this happens and you want to continue, modify the minAbs value. Once the quality control has been passed, the data is scaled by default (to skip this step, use scale = FALSE). The next steps depend on the method selected for finding peaks:

  1. Absorbance sum spectrum: in this case the absorbance sum spectrum is calculated and the peaks are searched for in it.

  2. Second derivative sum spectrum: in this case the second derivative of the absorbance data is calculated and the peaks are then searched for in its sum spectrum.
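The two search methods can be illustrated with a short base R sketch. This is an assumption-laden illustration, not the algorithm used by findPeaks(): the spectra are synthetic Gaussian bands, scaling is skipped, the second derivative is approximated with central finite differences and negated so that bands appear as maxima, and the helper names (local_maxima, neg_second_diff, cut_off) are hypothetical.

```r
# Synthetic data: two samples with bands centred at 1000 and 1500 cm-1.
wn <- seq(800, 1700, by = 2)
band <- function(centre, width, height) {
  height * exp(-((wn - centre) / width)^2)
}
spectra <- data.frame(
  WN = wn,
  A  = band(1000, 30, 1.0) + band(1500, 40, 0.6),
  B  = band(1000, 30, 0.8) + band(1500, 40, 0.9)
)

# Indices of strict local maxima (interior points only, NAs ignored).
local_maxima <- function(y) {
  i <- 2:(length(y) - 1)
  i[which(y[i] > y[i - 1] & y[i] > y[i + 1])]
}

# Method 1: sum the absorbances across samples, then look for maxima.
sum_abs <- rowSums(spectra[-1])
peaks_abs <- spectra$WN[local_maxima(sum_abs)]

# Method 2: negated second difference of each spectrum, summed across samples.
neg_second_diff <- function(y) {
  n <- length(y)
  c(NA, -(y[3:n] - 2 * y[2:(n - 1)] + y[1:(n - 2)]), NA)
}
sum_ndd <- rowSums(sapply(spectra[-1], neg_second_diff))
idx <- local_maxima(sum_ndd)

# A cut-off on the sum spectrum value discards weak candidates, such as the
# shallow negative maximum that appears between the two bands.
cut_off <- 0
peaks_ndd <- spectra$WN[idx[sum_ndd[idx] > cut_off]]
```

On these well-separated synthetic bands both methods recover the band centres. On real spectra the second derivative usually yields many more candidate peaks, as the dimensions of the peak tables in the examples that follow suggest.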

# Search peaks based on absorbance sum spectrum
# with standardised absorbance data
fp.abs <- findPeaks(andurinhaData, ndd = FALSE)
summary(fp.abs)
#>                        Length Class      Mode
#> dataZ                  4      data.frame list
#> sumSpectrum_peaksTable 2      data.frame list
#> peaksTable             4      data.frame list
dim(fp.abs$sumSpectrum_peaksTable)
#> [1] 34  2

# Search peaks based on second derivative sum spectrum
# with standardised absorbance data
fp.ndd <- findPeaks(andurinhaData)
summary(fp.ndd)
#>                        Length Class      Mode
#> dataZ                  4      data.frame list
#> secondDerivative       4      data.frame list
#> sumSpectrum_peaksTable 2      data.frame list
#> peaksTable             4      data.frame list
dim(fp.ndd$sumSpectrum_peaksTable)
#> [1] 220   2

# Search peaks based on second derivative sum spectrum
# with non-standardised absorbance data
fp.nZs <- findPeaks(andurinhaData, scale = FALSE)
summary(fp.nZs)
#>                        Length Class      Mode
#> secondDerivative       4      data.frame list
#> sumSpectrum_peaksTable 2      data.frame list
#> peaksTable             4      data.frame list
dim(fp.nZs$sumSpectrum_peaksTable)
#> [1] 219   2

Visualisation

To visualise both the raw data and the data processed by findPeaks(), use the functions gOverview() and plotPeaks().

gOverview():

Gives a graphic summary of the data. This function has the following arguments:

  1. data_abs: a data frame with the absorbance data. The structure should be: wave numbers in the first column and the sample absorbances in the following columns.

  2. data_ndd: a data frame with the second derivative data. The structure should be: wave numbers in the first column and the sample second derivative values in the following columns.

  3. fontFamily: the font used in the plot.

# Graphic overview of the raw data
gOverview(andurinhaData)