glsw Documentation


PLS_Toolbox Documentation: glsw

glsw

Purpose

Calculate or apply Generalized Least Squares weighting.

Synopsis

modl = glsw(x,a); %GLS on matrix

modl = glsw(x1,x2,a); %GLS between two data sets

modl = glsw(x,y,a); %GLS on matrix in groups based on y

xt = glsw(newx,modl,options); %apply correction

Description

Uses Generalized Least Squares to down-weight variable features identified from the singular value decomposition of a data matrix. The input data usually represents measured populations which should otherwise be the same (e.g. the same samples measured on two different analyzers or using two different solvents) and can be input in one of several forms, as explained below.

If the SVD of the input matrix x is X = U S V^T, then the deweighting matrix is estimated with the following pseudo-inverse: W = U diag(sqrt(1/(diag(S)/a^2 + 1))) V^T, where the center term defines S_inv. The adjustable parameter a scales the singular values before their inverse is calculated. As a gets larger, the extent of deweighting decreases (because S_inv approaches 1). As a gets smaller (e.g. 0.1 to 0.001), the extent of deweighting increases (because S_inv approaches 0) and the deweighting includes increasing amounts of the directions represented by smaller singular values.

A good initial guess for a is 1e-2, but the best value will vary depending on the covariance structure of X and the specific application.
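The effect of a on S_inv can be checked numerically. The following is a minimal sketch in base MATLAB that evaluates only the center term of the expression above; it is not the glsw source code. The singular values in s and the helper name sinv are hypothetical and chosen for illustration.

```matlab
% Illustrative singular values (hypothetical, not taken from real data)
s = [10; 1; 0.1; 0.01];

% S_inv as written in the text: sqrt(1/(s/a^2 + 1)), evaluated elementwise
sinv = @(s, a) sqrt(1 ./ (s ./ a.^2 + 1));

w_large = sinv(s, 100);    % large a: weights stay close to 1 (little deweighting)
w_guess = sinv(s, 1e-2);   % suggested starting value for a
w_small = sinv(s, 1e-3);   % small a: weights are pushed toward 0

% Columns: singular value, then the weight for each choice of a
disp([s, w_large, w_guess, w_small]);
```

In each column the largest singular value receives the smallest weight, and lowering a shrinks the weights of the smaller singular values as well, which matches the behavior described above.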

For calibration, inputs can be provided by one of three methods:

1) x = data matrix containing features to be downweighted, and

a = scalar parameter limiting downweighting {default = 1e-2}.

2) x1 = an M by N data matrix,

x2 = an M by N data matrix, and

a = scalar parameter limiting downweighting {default = 1e-2}.

The row-by-row differences between x1 and x2 will be used to estimate the downweighting.

3) x = an M by N data matrix,

y = column vector with M rows which specifies sample groups in x within which differences should be downweighted, and

a = scalar parameter limiting downweighting {default = 1e-2}.
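The three calibration forms can be sketched as follows. This example assumes PLS_Toolbox is on the MATLAB path and uses only the call signatures listed in the Synopsis. All data are synthetic, and the sizes, noise level, and group labels are hypothetical.

```matlab
% Requires PLS_Toolbox. All data below are synthetic placeholders.
rng(1);                          % reproducible random numbers
m = 30;  n = 8;                  % hypothetical sizes: M samples, N variables
a = 1e-2;                        % documented default for a

% Method 1: a single matrix containing the features to be downweighted
x = randn(m, n);
modl1 = glsw(x, a);

% Method 2: the same samples measured twice (e.g. on two analyzers);
% the row-by-row differences between x1 and x2 drive the downweighting
x1 = randn(m, n);
x2 = x1 + 0.05*randn(m, n);      % hypothetical second measurement
modl2 = glsw(x1, x2, a);

% Method 3: one matrix plus a column vector of group labels (M rows)
y = repmat((1:3)', m/3, 1);      % three hypothetical groups of ten samples
modl3 = glsw(x, y, a);
```

Note that methods 2 and 3 share the same three-argument form, so the second input must be either a matrix the same size as the first (method 2) or a column vector with M rows (method 3).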

An options structure can be used in place of a in any call, or as the third input in an apply call. This structure can contain any of the following fields:

a: [{0.02}] scalar parameter limiting downweighting {default = 1e-2},

columngradient: [ {'no'} | 'yes' ] governs the use of sample similarity, as determined by similarity in the provided y values, to determine the deweighting matrix (only applicable when y is supplied).

applymean: [ 'no' | {'yes'} ] governs the use of the mean difference calculated between two instruments (difference between two instruments mode). When applying a GLS filter to data collected on the first (x1) instrument, the mean should NOT be applied. Data collected on the second (x2) instrument should have the mean applied.

When applying a GLSW model the inputs are newx, the x-block to be deweighted, and modl, a GLSW model structure.

Outputs are modl, a GLSW model structure, and xt, the deweighted x-block.
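A calibrate-then-apply sequence for the two-instrument case might look like the sketch below. It assumes PLS_Toolbox is on the MATLAB path and that an options structure containing only some of the documented fields is accepted, as the options description suggests. The data, the instrument offset, and the variable names (calopts, new1, new2, xt1, xt2) are hypothetical.

```matlab
% Requires PLS_Toolbox. All data below are synthetic placeholders.
rng(2);
m = 30;  n = 8;
x1 = randn(m, n);                        % samples on the first instrument
x2 = x1 + 0.05*randn(m, n) + 0.2;        % same samples, hypothetical offset on the second

% Calibrate: an options structure is passed in place of the scalar a
calopts = struct('a', 1e-2);
modl = glsw(x1, x2, calopts);

% New measurements to be corrected
new1 = randn(5, n);                      % collected on the first instrument
new2 = randn(5, n) + 0.2;                % collected on the second instrument

% Apply: options is the third input; applymean follows the guidance above
xt1 = glsw(new1, modl, struct('applymean', 'no'));   % first instrument: no mean applied
xt2 = glsw(new2, modl, struct('applymean', 'yes'));  % second instrument: mean applied
```

The only difference between the two apply calls is the applymean setting, which depends on the instrument the new data came from.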