Learning | Data Perturbation
Data Perturbation is an algorithm that adds random noise to the weight of each observation in the database. The additive noise is drawn from a normal distribution with zero mean and a user-defined standard deviation. A Decay Factor can be set to progressively attenuate the standard deviation at each iteration.
Data Perturbation can be used in the context of:
- Machine Learning: perturbation helps escape from local minima in the learning process. The decay factor is typically set to values smaller than 1 in order to test different degrees of perturbation.
- Cross Validation: perturbation introduces variability in the data. As such, it acts as a kind of bootstrap algorithm, using continuous observation weights instead of integer resampling counts. The decay factor is typically set to 1 so that all evaluations use the same degree of perturbation.
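The weight-perturbation scheme described above can be sketched as follows. This is a minimal illustration, not BayesiaLab's implementation: the function name `perturb_weights` is hypothetical, and clamping perturbed weights to be non-negative is an assumption, since a negative observation weight would not be meaningful.

```python
import numpy as np

def perturb_weights(weights, sigma, decay, n_iterations, seed=0):
    """Sketch of iterative weight perturbation with a decay factor.

    weights      -- initial weight of each observation
    sigma        -- initial standard deviation of the additive noise
    decay        -- Decay Factor; < 1 attenuates sigma each iteration,
                    1 keeps the same degree of perturbation throughout
    n_iterations -- number of perturbation iterations to run
    """
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_iterations):
        # Additive noise from a normal distribution with zero mean.
        noise = rng.normal(loc=0.0, scale=sigma, size=len(weights))
        # Assumption: keep perturbed weights non-negative.
        perturbed = np.maximum(np.asarray(weights) + noise, 0.0)
        results.append(perturbed)
        # Attenuate the standard deviation for the next iteration.
        sigma *= decay
    return results
```

With `decay=1` every iteration draws noise with the same standard deviation (the Cross Validation setting); with `decay < 1` the perturbation shrinks over iterations (the typical Machine Learning setting).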
New Feature: Learning
Given these two distinct use cases, we have added Data Perturbation as a new learning algorithm, so as to take into account the specificities of the algorithm when it is used for machine learning.
The parameters are essentially the same as those that are available under Tools | Cross Validation.
Here, however, the output of the algorithm is not a Cross Validation Report. Instead, the output is the best network learned across the iterations of the trial, as evaluated against the original data.