Augmentation
Data augmentation applies stochastic transformations to the spectral data, adding artifacts that resemble those found in real-world measurements. Its objective is to increase the size of the dataset and to improve the generalization of the model. The following algorithms are available:
- Augmentation with normal noise
- Augmentation with uniform noise
- Augmentation with exponential noise
- Baseline shift
- Peak shift
- Scale spectrum
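As a rough illustration of how augmentation can increase the size of a dataset, the sketch below stacks the original spectra with a few augmented copies. The helper `expand_dataset`, the `noisy` callable and the synthetic data are hypothetical and use NumPy only; they are not part of chemotools.

```python
import numpy as np

def expand_dataset(spectra, labels, augment, n_copies=2):
    """Stack the original spectra with n_copies augmented versions."""
    augmented_sets = [augment(spectra) for _ in range(n_copies)]
    X = np.vstack([spectra, *augmented_sets])
    # Each augmented copy keeps the labels of the spectra it was derived from
    y = np.concatenate([labels] * (n_copies + 1))
    return X, y

rng = np.random.default_rng(0)
spectra = rng.random((5, 50))   # 5 synthetic spectra with 50 points each
labels = np.arange(5)

# Hypothetical augmentation: small Gaussian perturbation of every data point
def noisy(X):
    return X + rng.normal(0.0, 0.001, size=X.shape)

X_big, y_big = expand_dataset(spectra, labels, noisy, n_copies=2)
```

Any callable that maps an array of spectra to an array of the same shape should fit the `augment` argument, which is why the sketch keeps it generic.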
Augmentation with normal noise
Adds normally distributed noise to the spectrum. Gaussian noise with mean 0 and a user-defined standard deviation is added to each data point of the spectrum.
Arguments
Argument | Description | Type | Default |
---|---|---|---|
scale | Standard deviation of the normal distribution. | float | 0.0 |
random_state | Seed for the random number generator. | int | None |
Usage Example
from chemotools.augmentation import NormalNoise
normal_noise = NormalNoise(scale=0.001)
augmented_spectra = normal_noise.fit_transform(spectra)
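To make the operation concrete, the following NumPy sketch mimics what the description implies: zero-mean Gaussian noise with standard deviation `scale` added to every data point. The helper `add_normal_noise` and the synthetic `spectra` array are illustrative assumptions, and the library's internals may differ.

```python
import numpy as np

def add_normal_noise(spectra, scale, seed=None):
    """Add zero-mean Gaussian noise with standard deviation `scale`."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=scale, size=spectra.shape)
    return spectra + noise

# Synthetic stand-in for real spectra: 3 flat spectra with 100 points each
spectra = np.ones((3, 100))
augmented = add_normal_noise(spectra, scale=0.001, seed=42)
```

Passing the same `seed` reproduces the same noise, which mirrors the role of `random_state` in the table above.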
Augmentation with uniform noise
Adds uniformly distributed noise to the spectrum. Uniform noise with user-defined minimum and maximum values is added to each data point of the spectrum.
Arguments
Argument | Description | Type | Default |
---|---|---|---|
min | Minimum value of the uniform distribution. | float | 0.0 |
max | Maximum value of the uniform distribution. | float | 0.0 |
random_state | Seed for the random number generator. | int | None |
Usage Example
from chemotools.augmentation import UniformNoise
uniform_noise = UniformNoise(min=-0.001, max=0.001)
augmented_spectra = uniform_noise.fit_transform(spectra)
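A minimal NumPy sketch of the same idea, assuming the noise is drawn independently for every data point from the interval between `low` and `high`. The helper `add_uniform_noise` and its parameter names are hypothetical and chosen to avoid shadowing Python's built-in `min` and `max`.

```python
import numpy as np

def add_uniform_noise(spectra, low, high, seed=None):
    """Add noise drawn uniformly from the interval [low, high)."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(low, high, size=spectra.shape)
    return spectra + noise

# Synthetic stand-in for real spectra: 3 flat spectra with 100 points each
spectra = np.zeros((3, 100))
augmented = add_uniform_noise(spectra, low=-0.001, high=0.001, seed=42)
```

Unlike Gaussian noise, the perturbation here is strictly bounded, so no data point moves by more than the chosen limits.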
Augmentation with exponential noise
Augments the spectra by adding noise that follows an exponential distribution with a given standard deviation.
Arguments
Argument | Description | Type | Default |
---|---|---|---|
scale | Standard deviation of the exponential distribution. | float | 0.0 |
random_state | Seed for the random number generator. | int | None |
Usage Example
from chemotools.augmentation import ExponentialNoise
exponential_noise = ExponentialNoise(scale=0.001)
augmented_spectra = exponential_noise.fit_transform(spectra)
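The sketch below illustrates exponential noise with NumPy, assuming `scale` is passed directly as the scale parameter of the distribution (for an exponential distribution the scale equals both the mean and the standard deviation). The helper `add_exponential_noise` is hypothetical and not taken from chemotools.

```python
import numpy as np

def add_exponential_noise(spectra, scale, seed=None):
    """Add noise drawn from an exponential distribution with the given scale."""
    rng = np.random.default_rng(seed)
    noise = rng.exponential(scale=scale, size=spectra.shape)
    return spectra + noise

# Synthetic stand-in for real spectra: 3 flat spectra with 100 points each
spectra = np.ones((3, 100))
augmented = add_exponential_noise(spectra, scale=0.001, seed=42)
```

Because exponential samples are never negative, this kind of noise only pushes intensities upward, in contrast to the symmetric normal noise.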
Augmentation with baseline shift
Adds a baseline to the data. The baseline is drawn from a one-sided uniform distribution between 0 and scale.
Arguments
Argument | Description | Type | Default |
---|---|---|---|
scale | Upper bound of the baseline to add. The baseline is drawn from a uniform distribution between 0 and scale. | float | 0.0 |
random_state | Seed for the random number generator. | int | None |
Usage Example
from chemotools.augmentation import BaselineShift
baseline = BaselineShift(scale=0.05)
augmented_spectra = baseline.fit_transform(spectra)
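One way to picture a baseline shift is as a single offset added to all points of a spectrum. The NumPy sketch below assumes one offset per spectrum, drawn uniformly between 0 and `scale`; the helper `add_baseline` is hypothetical, and whether the library draws one offset per spectrum or per dataset is an assumption here.

```python
import numpy as np

def add_baseline(spectra, scale, seed=None):
    """Add one offset per spectrum, drawn uniformly from [0, scale)."""
    rng = np.random.default_rng(seed)
    # Shape (n_spectra, 1) broadcasts the same offset over all points of a row
    offsets = rng.uniform(0.0, scale, size=(spectra.shape[0], 1))
    return spectra + offsets

# Synthetic stand-in for real spectra: 4 identical ramps with 100 points each
spectra = np.tile(np.linspace(0.0, 1.0, 100), (4, 1))
augmented = add_baseline(spectra, scale=0.05, seed=42)
```

The shape of each spectrum is preserved; only its vertical position changes.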
Augmentation with peak shift
Augments the spectra by shifting the peaks a given number of indices along the x-axis. This technique is especially useful for Raman spectra, where peak shifts between or within instruments can occur as a result of differences in the gratings.
Arguments
Argument | Description | Type | Default |
---|---|---|---|
shift | Maximum number of indices by which the spectrum is shifted to the left or right. A random value between -shift and +shift is drawn from a uniform distribution. | float | 0.0 |
random_state | Seed for the random number generator. | int | None |
Usage Example
from chemotools.augmentation import IndexShift
shift = IndexShift(shift=2)
augmented_spectra = shift.fit_transform(spectra)
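The following NumPy sketch shows one possible reading of an index shift: each spectrum is moved by a random whole number of indices between -shift and +shift. The helper `shift_indices` is hypothetical, and the use of `np.roll` (which wraps points around the edges) is an assumption of this sketch; an actual implementation may treat the edges differently.

```python
import numpy as np

def shift_indices(spectra, shift, seed=None):
    """Shift each spectrum by a random integer offset in [-shift, +shift]."""
    rng = np.random.default_rng(seed)
    # One integer offset per spectrum, both end points included
    offsets = rng.integers(-shift, shift, endpoint=True, size=spectra.shape[0])
    # np.roll wraps values around the edges of the spectrum
    return np.stack([np.roll(row, k) for row, k in zip(spectra, offsets)])

# Synthetic stand-in for real spectra: a single Gaussian peak centered at index 50
x = np.arange(100)
peak = np.exp(-0.5 * ((x - 50) / 3.0) ** 2)
spectra = np.tile(peak, (4, 1))
augmented = shift_indices(spectra, shift=2, seed=0)
```

Intensities are left unchanged; only the position of the peak along the x-axis moves, by at most `shift` indices in either direction.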
Augmentation with peak scaling
Scales the data by a value drawn from a uniform distribution centered around 1.0, between 1 - scale and 1 + scale.
Arguments
Argument | Description | Type | Default |
---|---|---|---|
scale | Half-width of the uniform distribution from which the scaling factor is drawn. The factor lies between 1 - scale and 1 + scale. | float | 0.0 |
random_state | Seed for the random number generator. | int | None |
Usage Example
from chemotools.augmentation import SpectrumScale
scale = SpectrumScale(scale=0.01)
augmented_spectra = scale.fit_transform(spectra)
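As a sketch of the scaling step, the NumPy code below multiplies each spectrum by a factor drawn uniformly between 1 - scale and 1 + scale. The helper `scale_spectra` is hypothetical, and drawing one factor per spectrum is an assumption rather than a statement about the library's internals.

```python
import numpy as np

def scale_spectra(spectra, scale, seed=None):
    """Multiply each spectrum by a factor drawn uniformly from [1 - scale, 1 + scale)."""
    rng = np.random.default_rng(seed)
    # Shape (n_spectra, 1) applies the same factor to all points of a row
    factors = rng.uniform(1.0 - scale, 1.0 + scale, size=(spectra.shape[0], 1))
    return spectra * factors

# Synthetic stand-in for real spectra: 4 identical ramps with strictly positive values
spectra = np.tile(np.linspace(1.0, 2.0, 100), (4, 1))
augmented = scale_spectra(spectra, scale=0.01, seed=42)
```

With `scale=0.01` every intensity changes by at most 1 %, while the ratios between points within a spectrum stay the same.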