PLS_Toolbox Documentation: preprocess
preprocess
Purpose
Selection and application of preprocessing methods.
Synopsis
s = preprocess(s) %GUI preprocessing selection
s = preprocess('default','methodname') %Non-GUI selection
[datap,sp] = preprocess('calibrate',s,data) %single block calibrate
[datap,sp] = preprocess('calibrate',s,xblock,yblock) %multi-block
datap = preprocess('apply',sp,data) %apply to new data
data = preprocess('undo',sp,datap) %undo preprocessing
Description
PREPROCESS is a general tool for choosing preprocessing steps and performing them on data. See PREPROUSER for a description of how custom preprocessing can be added to the standard preprocessing methods listed below. PREPROCESS has four basic command-line forms:
1) SELECTION OF PREPROCESSING.
The purpose of the following calls to PREPROCESS is to generate standard structure arrays that contain the desired preprocessing steps.
s = preprocess;
generates a GUI and allows the user to select preprocessing steps interactively. The output s is a standard preprocessing structure.
s = preprocess(s);
allows the user to interactively edit a previously identified preprocessing structure s. The output s is the edited preprocessing structure.
s = preprocess('default','methodname');
returns the default structure for method 'methodname'. A list of strings that can be used for 'methodname' can be viewed using the command:
preprocess('keywords')
A list of standard methods for 'methodname' follows:
The output is a standard preprocessing structure array s where each method to apply is a separate record.
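As a minimal sketch of non-GUI selection, the following lists the available keywords and builds preprocessing structures from the command line. It assumes PLS_Toolbox is on the MATLAB path and that 'autoscale', 'sg', and 'mean center' are accepted as 'methodname' on the installed version (all three names appear elsewhere on this page). Chaining two methods by concatenating their default structures is an assumption based on the statement that each method is a separate record; it is not a call form documented here.

```matlab
% List the strings accepted as 'methodname' (printed to the command window).
preprocess('keywords')

% Build the default preprocessing structure for a single method, no GUI.
% Assumption: 'autoscale' is a valid keyword on this installation.
s = preprocess('default','autoscale');

% Assumption: default structures share the same fields, so two methods can
% be chained by ordinary struct-array concatenation (one record per method).
s2 = [preprocess('default','sg') preprocess('default','mean center')];

% Number of preprocessing steps held in each structure array.
nsteps  = numel(s);
nsteps2 = numel(s2);
```

The resulting structure can then be passed to the 'calibrate' form, or to preprocess(s) to review it interactively.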
2) CALIBRATE.
The objective of the following calls to PREPROCESS is to estimate preprocessing parameters, if any, from a calibration data set and perform preprocessing on the calibration data set. The I/O format is:
[datap,sp] = preprocess('calibrate',s,data);
The inputs are s, a standard preprocessing structure, and data, the calibration data. The preprocessed data is returned in datap, and the preprocessing parameters are returned in a modified preprocessing structure sp. Note that sp is used as an input with the 'apply' and 'undo' commands described below.
Shortcuts for each method can also be used. Examples for 'mean center' and 'autoscale' are:
[datap,sp] = preprocess('calibrate','mean center',data);
[datap,sp] = preprocess('calibrate','autoscale',data);
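To make the calibrate step concrete, here is a small sketch that mean centers a random matrix with the shortcut form and checks the result. The data matrix is hypothetical, and reading numeric values through a .data field when the output is a dataset object is an assumption about the returned class rather than something stated on this page.

```matlab
% Hypothetical calibration data: 20 samples by 5 variables.
data = rand(20,5);

% Estimate the column means from the calibration data and remove them.
% sp now carries the estimated means for later 'apply' and 'undo' calls.
[datap,sp] = preprocess('calibrate','mean center',data);

% Assumption: if the result is a dataset object, its values are in .data.
if isa(datap,'dataset'), vals = datap.data; else, vals = datap; end

% After mean centering, every column mean should be zero to rounding error.
maxmean = max(abs(mean(vals,1)));
```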
Some multi-block preprocessing methods require that the y-block also be passed. The I/O format in these cases is:
[datap,sp] = preprocess('calibrate',s,xblock,yblock);
The 'methodname' values that require a y-block are:
'osc'
'gls weighting'
3) APPLY.
The objective of the following call to PREPROCESS
datap = preprocess('apply',sp,data)
is to apply the calibrated preprocessing in sp to new data. The inputs are sp, the modified preprocessing structure (see 2 above), and data, the data to which the preprocessing is applied. The output datap is the preprocessed data, of class “dataset”.
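The difference between 'calibrate' and 'apply' can be sketched as follows: parameters are estimated once from the calibration data and then reused unchanged on new data. Both matrices are hypothetical, and the .data access is an assumption about the dataset object.

```matlab
cal     = rand(20,5);   % hypothetical calibration data
newdata = rand(8,5);    % hypothetical new data with the same 5 variables

% Estimate the means on the calibration set only.
[calp,sp] = preprocess('calibrate','mean center',cal);

% Reuse the calibrated parameters in sp on the new data.
newp = preprocess('apply',sp,newdata);

% Assumption: numeric values of a dataset object are in .data.
if isa(newp,'dataset'), vals = newp.data; else, vals = newp; end

% The new data should be shifted by the calibration means, not their own.
expected = newdata - repmat(mean(cal,1),size(newdata,1),1);
err = max(abs(vals(:) - expected(:)));
```

Because the new data are centered with the calibration means, their own column means will generally not be zero after 'apply'.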
4) UNDO.
The inverse of applying preprocessing is performed by the following call to PREPROCESS:
data = preprocess('undo',sp,datap);
The inputs are sp, the modified preprocessing structure (see 2 above), and datap, the data (class “double” or “dataset”) from which the preprocessing is removed. Note that for some preprocessing methods, for example 'osc' and 'sg', an inverse does not exist or has not been defined, and an 'undo' call will cause an error. One reason for not defining an inverse is that it would require a significant amount of memory when data sets are large.
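A round trip through 'calibrate' and 'undo' can be sketched as below. Since 'undo' raises an error for methods without a defined inverse, the call is wrapped in try/catch. The data are hypothetical, the .data access is an assumption about the dataset object, and 'autoscale' is assumed to have a defined inverse.

```matlab
% Hypothetical data: 20 samples by 5 variables.
data = rand(20,5);

% Autoscale the data and keep the calibrated parameters in sp.
[datap,sp] = preprocess('calibrate','autoscale',data);

try
  % Remove the preprocessing again using the parameters stored in sp.
  back = preprocess('undo',sp,datap);

  % Assumption: numeric values of a dataset object are in .data.
  if isa(back,'dataset'), vals = back.data; else, vals = back; end

  % The recovered values should match the original data to rounding error.
  roundtrip_err = max(abs(vals(:) - data(:)));
catch ME
  % Reached when no inverse is defined for a method in sp.
  roundtrip_err = NaN;
  disp(ME.message)
end
```

Guarding the call this way lets a script continue, or fall back to the stored original data, when sp contains a method that cannot be undone.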
See Also
crossval, pca, pcr, pls, preprocatalog, preprouser