Permutetest

From Eigenvector Documentation Wiki

Purpose

Permutation testing for regression and classification models.

Synopsis

results = permutetest(x,y,rm,cvi,ncomp,options)

Description

Performs a permutation test in which the y-block is shuffled, allowing calculation of the probability that the results obtained with the unperturbed y-block are significant (as compared to random chance). Inputs are identical to the standard call to cross-validation (see crossval), except that the number of iterations given in the cvi input is used for permutation of the y-block instead of permutation of the leave-out sets.
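The idea of shuffling the y-block can be sketched in base MATLAB. This is a minimal illustration, not the toolbox implementation: a plain least-squares line fit stands in for the regression method, no cross-validation is done, and the empirical probability computed at the end is a generic permutation p-value rather than one of the tests reported by permutetest. All variable names are chosen for the example.

```matlab
% Sketch: y-block permutation test around a least-squares line fit.
% Assumption: the line fit stands in for the regression method (rm).
rng(1);                                  % reproducible noise and shuffles
n     = 30;                              % number of samples
niter = 200;                             % number of permutations
x = linspace(0, 1, n)';                  % X-block (one variable)
y = 2*x + 0.05*randn(n, 1);              % y-block with a real x-y relation

X = [ones(n, 1) x];                      % design matrix with intercept
rmse = @(yy) sqrt(mean((yy - X*(X\yy)).^2));   % RMSEC of the fit to yy

rmsec = rmse(y);                         % unpermuted y-block
rmsecperm = zeros(1, niter);
for k = 1:niter
    yperm = y(randperm(n));              % shuffle y, breaking the x-y pairing
    rmsecperm(k) = rmse(yperm);
end

% Empirical probability that chance alone does as well as the real model
pemp = (1 + sum(rmsecperm <= rmsec)) / (niter + 1);
fprintf('RMSEC = %.3f, median permuted RMSEC = %.3f, p = %.4f\n', ...
        rmsec, median(rmsecperm), pemp);
```

When the original error is far below every permuted error, as it should be here, the relationship between x and y is unlikely to be a chance result.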

In addition to storing the Root Mean Square Error of Calibration (RMSEC) and of Cross-Validation (RMSECV) for every permutation, the function compares the self-predicted and cross-validated residuals of each permutation to the original residuals using the following tests:

  • Wilcoxon test
  • Sign test
  • Randomized t-test
  • T-test

These tests give the probability that the two sets of residuals are similar. A low probability therefore indicates that the perturbed results are significantly different from those of the original model and, thus, that the original model is significant. Note that many of these tests can provide valid results even with very few iterations. However, more iterations improve the results and also permit better plots.
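As an illustration of how a paired test on residuals yields such a probability, the following base-MATLAB sketch runs a two-sided sign test on made-up residuals. It rests on assumptions: the residuals are paired sample by sample and compared by absolute value, and the test is two-sided. This page does not state how the toolbox pairs residuals or which sidedness it uses.

```matlab
% Sketch: two-sided sign test on paired absolute residuals.
% Assumption: residuals are paired sample by sample; the values are made up.
eorig = [0.1 -0.2 0.05 0.3 -0.1 0.15 -0.05 0.2 -0.25 0.1];   % original model
eperm = [0.9 -1.1 0.70 1.2 -0.8 0.60 -1.00 0.9 -0.70 1.3];   % permuted model

d = abs(eperm) - abs(eorig);     % positive when the permuted fit is worse
d = d(d ~= 0);                   % ties carry no sign information
m = numel(d);                    % number of informative pairs
k = sum(d > 0);                  % pairs where the permuted residual is larger

% Binomial(m, 0.5) probabilities for 0..m positive signs
pmf = arrayfun(@(j) nchoosek(m, j), 0:m) * 0.5^m;
plow  = sum(pmf(1:k+1));         % P(K <= k)
phigh = sum(pmf(k+1:end));       % P(K >= k)
psign = min(1, 2*min(plow, phigh));

fprintf('%d of %d pairs favor the original model, p = %.4f\n', k, m, psign);
```

With all ten pairs favoring the original model, the probability is 2/1024, so the two residual sets would be judged dissimilar.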

When requested, a final plot is shown of the fraction of y-block information captured by calibration and cross-validation versus y-correlation. For more information, see permuteplot.
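The y-correlation on the horizontal axis of that plot can be pictured as the correlation between the original y-block and each shuffled copy. The base-MATLAB sketch below computes it for a single y column. It is an illustration under that assumption, not the toolbox code, and it does not attempt to reproduce the vertical axis, whose formula is not given on this page.

```matlab
% Sketch: correlation of an original y column with each permuted copy.
rng(1);                          % reproducible shuffles
y     = (1:20)';                 % example y-block column
niter = 100;                     % number of permutations
ycor  = zeros(1, niter);
for k = 1:niter
    yperm = y(randperm(numel(y)));   % shuffled copy of y
    c = corrcoef(y, yperm);          % 2x2 correlation matrix
    ycor(k) = c(1, 2);               % off-diagonal element is the correlation
end
fprintf('y-correlation range over %d permutations: [%.2f %.2f]\n', ...
        niter, min(ycor), max(ycor));
```

Permutations that happen to stay well correlated with the original y are expected to retain more of its information, which is why the plot is drawn against this quantity.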

Inputs

  • x = X-block data to be tested (DataSet or Double)
  • y = Y-block data to be tested (DataSet, Double or logical)
  • rm = regression method as defined in crossval
  • cvi = Cell array defining data split method and total permutation iterations to be performed: {'method' splits iterations}
  • ncomp = Maximum number of latent variables to be tested

Optional Inputs

  • model = existing PCA model, onto which new data x is to be applied.
  • options = discussed below.

Outputs

  • results: A structure with the following fields. Sizes are defined using lvs = number of latent variables, iter = number of iterations performed, and ny = number of y-block columns. All fields are size = [lvs iter ny] unless stated otherwise.
  • rmsecvperm: RMSECV for each permuted y-block.
  • rmsecperm: RMSEC for each permuted y-block.
  • rmsecv: RMSECV for the original unpermuted y-block. Size = [lvs 1 ny].
  • rmsec: RMSEC for the original unpermuted y-block. Size = [lvs 1 ny].
  • cvprob: Probabilities calculated for cross-validated residuals. Sub-fields indicate the test method (listed in the Description). Sizes all = [lvs ny].
  • cprob: Probabilities calculated for self-predicted residuals. Sub-fields indicate the test method (listed in the Description). Sizes all = [lvs ny].
  • ycor: Correlation of each original y-block column (rows here) with each permuted y-block (columns).
  • rmsy: Root Mean Square of each y-block column.
  • y: The original unpermuted y-block.
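The documented array sizes determine how the fields are indexed. The sketch below builds a placeholder structure with the same field names and shapes (the numbers are invented and are not output from permutetest) and pulls out the values for one model size and one y column.

```matlab
% Sketch: reading fields from a results structure shaped as documented.
% The numbers are placeholders, NOT output from permutetest.
lvs = 3; iter = 50; ny = 1;
results.rmsecv     = reshape([0.40 0.25 0.22], [lvs 1 ny]);   % [lvs 1 ny]
results.rmsecvperm = 0.9 + 0.1*rand(lvs, iter, ny);           % [lvs iter ny]

lv   = 2;   % model size (latent variables) to inspect
ycol = 1;   % y-block column to inspect
orig = results.rmsecv(lv, 1, ycol);
perm = squeeze(results.rmsecvperm(lv, :, ycol));   % one value per permutation

% Fraction of permutations that match or beat the original RMSECV
fracbetter = mean(perm <= orig);
fprintf('LV %d: RMSECV = %.2f, permuted range = [%.2f %.2f]\n', ...
        lv, orig, min(perm), max(perm));
fprintf('Fraction of permutations <= original: %.2f\n', fracbetter);
```

Note that MATLAB drops trailing singleton dimensions, so with ny = 1 the permuted fields appear as lvs-by-iter matrices.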

Options

options = Options structure with one or more of the fields defined in crossval (see crossval for details). In addition, the following field is defined for special use in this function:

  • plotlvs = [ 1 ] Model size (in latent variables) for which results are shown in the final display table and plots.
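A call following the Synopsis might look like the sketch below. Several values are assumptions that should be checked against the crossval documentation: 'sim' as the regression method string and 'vet' as the split method are not defined on this page, and the data are synthetic. The call is guarded so the script also runs where the toolbox is not installed.

```matlab
% Sketch of a call; the permutetest line runs only if the toolbox is found.
% Assumptions to verify in the crossval documentation: 'sim' (regression
% method) and 'vet' (split method) are illustrative choices.
rng(1);
x = randn(40, 8);                                 % synthetic X-block
y = x(:, 1:2) * [1; -0.5] + 0.1*randn(40, 1);     % synthetic y-block

cvi   = {'vet' 5 50};             % {'method' splits iterations}: 50 permutations
ncomp = 4;                        % maximum number of latent variables
options = struct('plotlvs', 2);   % show the 2-LV model in the table and plots

if exist('permutetest', 'file')
    results = permutetest(x, y, 'sim', cvi, ncomp, options);
    disp(size(results.rmsecvperm));   % documented size: [lvs iter ny]
else
    disp('permutetest not found; this call requires the toolbox.');
end
```

Only plotlvs is set here; any other crossval option would be added as a further field of the same structure.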

See Also

crossval, permuteplot, permuteprobs
