Doptimal
Purpose
Selects samples from a candidate matrix that satisfy the D-optimal condition.
Synopsis
- isel = doptimal(x,nosamps,iint,tol)
Description
DOPTIMAL selects nosamps samples from a candidate matrix x so as to maximize the determinant det(x(isel,:)'*x(isel,:)), where isel is a vector of indices of the selected samples.
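To make the criterion concrete, the following minimal sketch evaluates the determinant for two hand-picked subsets of a small candidate matrix. The matrix values, the helper name dcrit, and the index sets are assumptions chosen only for illustration; the sketch uses base MATLAB and does not call DOPTIMAL.

```matlab
% Hypothetical 6-by-2 candidate matrix (values chosen for illustration only).
x = [1 0; 0 1; 1 1; 2 0; 0 2; 1 2];

% D-criterion for a given index vector: det(x(isel,:)'*x(isel,:)).
% dcrit is an illustrative helper, not a toolbox function.
dcrit = @(isel) det(x(isel,:)'*x(isel,:));

dA = dcrit([1 2 3]);   % three samples close to the origin
dB = dcrit([4 5 6]);   % three samples farther from the origin

fprintf('det for isel = [1 2 3]: %g\n', dA);
fprintf('det for isel = [4 5 6]: %g\n', dB);
```

Here the first subset gives a determinant of 3 and the second gives 36, so by the D-optimal criterion the second subset is preferred.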
The optional input iint is a vector of indices used to initialize the optimization algorithm. If iint is not supplied, the algorithm is initialized with samples identified as lying on the exterior of the data set by the DISTSLCT function. This is in contrast to the random initial subset used in many algorithms. The reason is that the routine is based on Fedorov's algorithm (de Aguiar, P.F., Bourguignon, B., Khots, M.S., Massart, D.L., and Phan-Than-Luu, R., "D-optimal designs", Chemo. Intell. Lab. Sys., 30, 199-210, 1995), which requires calculating inv(x(isel,:)'*x(isel,:)), and the inverse for a random subset may not exist.
Starting from the initial set, the routine exchanges the 'least informative' sample in the selected set with a 'more informative' sample in the candidate set. The optional input tol sets the tolerance for the minimum increase in the determinant {default = 1e-4}.
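The exchange idea can be sketched in a few lines of base MATLAB. This is a simplified illustration, not the DOPTIMAL code: it recomputes the determinant by brute force for every possible swap instead of using the update formulas of Fedorov's algorithm, it treats tol as a relative increase (an assumption), and the candidate matrix and starting set are hypothetical.

```matlab
% Simplified exchange sketch (assumption: brute-force determinant evaluation,
% not the update formulas of Fedorov's algorithm or the DOPTIMAL code itself).
x    = [1 0; 0 1; 1 1; 2 0; 0 2; 1 2; 2 2; 3 1];   % hypothetical candidates
isel = [1 2 3];                                    % hypothetical starting set
tol  = 1e-4;                                       % assumed relative tolerance

dcur     = det(x(isel,:)'*x(isel,:));              % criterion for the starting set
improved = true;
while improved
    improved = false;
    cand = setdiff(1:size(x,1), isel);             % samples not currently selected
    best = dcur;  bi = 0;  bj = 0;
    for i = 1:numel(isel)
        for j = 1:numel(cand)
            trial    = isel;
            trial(i) = cand(j);                    % swap one selected sample for a candidate
            dtry     = det(x(trial,:)'*x(trial,:));
            if dtry > best
                best = dtry;  bi = i;  bj = cand(j);
            end
        end
    end
    % Accept the best swap only if the determinant grows by more than tol (relative).
    if bi > 0 && (best - dcur) > tol*dcur
        isel(bi) = bj;
        dcur     = best;
        improved = true;
    end
end
isel = sort(isel);
fprintf('selected indices: %s, det = %g\n', mat2str(isel), dcur);
```

Because each accepted swap strictly increases the determinant, the loop terminates, but the result can still depend on the starting set, as with any exchange method.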
Note that nosamps must be >= rank(x) (it is necessary but not sufficient that nosamps >= size(x,2)) for a good solution to be found. This is required so that a good estimate of inv(x(isel,:)'*x(isel,:)) can be obtained. When nosamps < size(x,2), the scores from PCA or PLS can be used in place of x, provided nosamps >= the number of factors (principal components or latent variables) used. Also note that the solution can depend on the initial guess and that isel does not necessarily represent a global optimum.
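As an illustration of the scores approach, the sketch below reduces a wide matrix to a few principal component scores using the base MATLAB svd function. The data, the number of components k, and the variable names are assumptions for the example; the toolbox's own PCA or PLS routines could be used instead, and the final call to DOPTIMAL is left commented out because it requires the toolbox.

```matlab
% Hypothetical wide data: 20 samples by 50 variables, so a small nosamps
% would be less than size(x,2).
x  = sin((1:20)' * (1:50));                  % deterministic illustrative data
xc = x - repmat(mean(x,1), size(x,1), 1);    % mean-center each column

k = 3;                                       % number of principal components (assumed)
[U,S,~] = svd(xc,'econ');                    % PCA via the singular value decomposition
t = U(:,1:k)*S(1:k,1:k);                     % 20-by-3 matrix of scores

nosamps = 5;                                 % nosamps >= k, as the note above requires
% isel = doptimal(t,nosamps);                % requires the toolbox function documented here
```

Selecting on t instead of x keeps the matrix t(isel,:)'*t(isel,:) at k by k, so it can be inverted with as few as k well-chosen samples.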
Inputs
- x: data matrix
- nosamps: number of samples to select
Optional Inputs
- iint: vector of initialization indices
- tol: tolerance for minimum increase in the determinant {default: 1e-4}
Outputs
- isel: vector of selected indices
Examples
For an input matrix x that is m by 5:
isel5 = doptimal(x,5);
isel6 = doptimal(x,6);