auto
Purpose
Autoscales a matrix to mean zero and unit variance.
Synopsis
[ax,mx,stdx,msg] = auto(x,options)
[ax,mx,stdx,msg] = auto(x,offset)
options = auto('options')
Description
[ax,mx,stdx] = auto(x) autoscales the matrix x and returns the resulting matrix ax, whose columns have zero mean and unit variance, together with a vector of column means mx and a vector of column standard deviations stdx used in the scaling. The output msg returns any warning messages. If missing data (NaNs) are found, the available data are autoscaled, provided the fraction missing does not exceed the thresholds specified below. mx and stdx can be used to scale new data (see SCALE).
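As a rough illustration of the computation described above, the following sketch reproduces basic autoscaling in plain MATLAB without calling auto. It assumes no missing data, default options, and the n-1 normalization of MATLAB's std (an assumption, not something this page confirms). The data and the names xnew and axnew are made up for the example.

```matlab
% Hypothetical calibration data: 4 samples (rows) by 2 variables (columns).
x = [1 10; 2 20; 3 30; 4 40];

mx   = mean(x, 1);     % column means
stdx = std(x, 0, 1);   % column standard deviations (n-1 normalization assumed)

% Center and scale each column (implicit expansion needs MATLAB R2016b or later).
ax = (x - mx) ./ stdx;

% New data are scaled with the calibration mx and stdx, not their own statistics.
xnew  = [2.5 25];
axnew = (xnew - mx) ./ stdx;
```

With PLS_Toolbox on the path, [ax,mx,stdx] = auto(x) would be expected to give comparable values for this data set.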
Options
options = a structure array with the following fields:
offset: value added to the standard deviation before scaling {default = 0}.
display: [ {'off'} | 'on' ] governs level of display to the command window.
matrix_threshold: fraction of missing data allowed based on the entire matrix (x) {default = 0.15}.
column_threshold: fraction of missing data allowed based on a single column {default = 0.25}.
algorithm: [ {'standard'} | 'robust' ] scaling algorithm. 'robust' uses MADC for scaling and the median instead of the mean; it should be used with robust techniques.
stdthreshold: [ 0 ] scalar or vector of standard deviation threshold values. If a standard deviation is below its corresponding threshold value, the threshold value is used in lieu of the actual value. Note that the actual standard deviation is always returned, whether or not it exceeds the threshold. A scalar value is used as the threshold for all variables.
badreplacement: [ 0 ] value to use in place of standard deviation values of 0 (zero). Typical values and their effects:
0 = Any value in the given variable is set to zero. The variable is effectively excluded (but still expected by the model). This is also the behavior when badreplacement = inf.
1 = Values different from the mean of the given variable are flagged in Q residuals with no reweighting.
Values >0 and <inf give the variable a different weighting in the Q residuals (values >1 down-weight the bad variables for Q residual calculations; values <1 up-weight them).
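The missing-data thresholds and stdthreshold can be pictured with a plain MATLAB sketch. This is only an illustration of the rules as stated in the option descriptions, not the actual implementation of auto. The data, the threshold value of 0.1, and the helper names (miss, frac_matrix, frac_columns, stduse) are hypothetical, and the n-1 normalization is assumed.

```matlab
% Hypothetical data with one missing value and one nearly constant column.
x = [1 5.00; 2 5.00; NaN 5.01; 4 5.00];

% Fractions of missing data compared against the documented default thresholds.
miss         = isnan(x);
frac_matrix  = sum(miss(:)) / numel(x);     % fraction missing in the whole matrix
frac_columns = sum(miss, 1) / size(x, 1);   % fraction missing in each column
ok = frac_matrix <= 0.15 && all(frac_columns <= 0.25);

% stdthreshold: a standard deviation below its threshold is replaced by the
% threshold for scaling only; the actual standard deviation is still reported.
stdthreshold = 0.1;                         % illustrative value, default is 0
n    = sum(~miss, 1);                       % available samples per column
xz   = x;  xz(miss) = 0;                    % zero-fill so sums skip the NaNs
mx   = sum(xz, 1) ./ n;                     % column means over available data
d    = x - mx;  d(miss) = 0;                % deviations, missing entries ignored
stdx = sqrt(sum(d.^2, 1) ./ (n - 1));       % actual standard deviations
stduse = max(stdx, stdthreshold);           % values used as the divisor
ax = (x - mx) ./ stduse;                    % NaNs stay NaN in the result
```

Here the second column has a standard deviation well below 0.1, so it is divided by 0.1 instead, which keeps its small fluctuations from being blown up to unit variance.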
If the second input (offset) is a scalar, it is used as the offset value, with all other options set to their default values. The optional input offset is added to the standard deviations before scaling and can be used to suppress low-level variables that would otherwise have standard deviations near zero.
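The effect of the offset can be seen in a small plain MATLAB sketch. It assumes the offset is simply added to each column's standard deviation in the divisor, as described above; the data and the offset value of 0.1 are invented for illustration.

```matlab
% Hypothetical data: column 2 is a low-level variable that is mostly noise.
x = [1 0.001; 2 0.002; 3 0.001; 4 0.002];
offset = 0.1;                             % illustrative value, default is 0

mx   = mean(x, 1);
stdx = std(x, 0, 1);                      % n-1 normalization assumed

ax_plain  = (x - mx) ./ stdx;             % without offset, both columns get unit variance
ax_offset = (x - mx) ./ (stdx + offset);  % with offset, the tiny column is shrunk

% Compare the spread of each column after the two scalings.
disp(std(ax_plain, 0, 1));
disp(std(ax_offset, 0, 1));
```

The first column is only mildly affected because its standard deviation is large relative to the offset, while the low-level second column ends up with a spread far below one.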
The default options can be retrieved using: options = auto('options');
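A typical calling pattern, based only on the synopsis and option fields listed on this page, might look like the sketch below. It requires PLS_Toolbox on the MATLAB path, and the data and chosen option values are hypothetical.

```matlab
% Requires PLS_Toolbox on the MATLAB path; x is hypothetical data.
x = [1 10; 2 20; 3 30; 4 41];

options = auto('options');        % start from the default options structure
options.offset    = 0.1;          % add an offset to the standard deviations
options.algorithm = 'robust';     % median and MADC instead of mean and std
[ax, mx, stdx, msg] = auto(x, options);

% Shorthand: a scalar second input is taken as the offset, other options default.
[ax2, mx2, stdx2] = auto(x, 0.1);
```

Starting from auto('options') rather than building the structure by hand keeps any fields that are not set explicitly at their defaults.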
See Also
gscale, medcn, mncn, normaliz, npreprocess, regcon, rescale, scale, snv