pcls                  package:mgcv                  R Documentation

_P_e_n_a_l_i_z_e_d _C_o_n_s_t_r_a_i_n_e_d _L_e_a_s_t _S_q_u_a_r_e_s _F_i_t_t_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     Solves least squares problems with quadratic penalties subject to
     linear equality and inequality constraints using quadratic
     programming.

_U_s_a_g_e:

     pcls(M)

_A_r_g_u_m_e_n_t_s:

       M: is the single list argument to 'pcls'. It should have  the 
          following elements (last 3 are not in argument for mgcv) :

        _y The response data vector.

        _w A vector of weights for the data (often proportional to the 
             reciprocal of the variance). 

        _X The design matrix for the problem, note that 'ncol(M$X)' must
             give the number of model parameters, while 'nrow(M$X)' 
             should give the number of data.

        _C Matrix containing any linear equality constraints  on the
             problem (i.e. C in Cp=0). If you have no equality
             constraints initialize this to a zero by zero matrix.

        _S A one dimensional array containing the non-zero elements of
             the penalty matrices. Let 'start<-sum(M$df[1:(k-1)]^2)' or
             'start<-0' if 'k==1'. Then penalty matrix 'k' has
             'M$S[start+i+M$df[i]*(j-1)' on its ith row and jth column.
             The kth full penalty matrix is constructed by placing the
             matrix just derived into an appropriate matrix of zeroes
             with it's 1,1 element at 'M$off[k],M$off[k]'

        _o_f_f Offset values locating the elements of 'M$S[]' in the
             correct location within each penalty coefficient matrix.
             (Zero offset implies starting in first location)

        _d_f 'M$df[i]' gives the number of rows and columns of 'M$S[i]'
             that are non-zero.

        _s_p An array of  smoothing parameter estimates.

        _p An array of feasible initial parameter estimates - these must
             satisfy the constraints, but should avoid satisfying the
             inequality constraints as equality constraints.

        _A_i_n Matrix for the inequality constraints A_in p > b. 

        _b_i_n vector in the inequality constraints. 

_D_e_t_a_i_l_s:

     This solves the problem:


 minimise || W^0.5 (Xp-y) ||^2 +  lambda_1 p'S_1 p + lambda_1 p'S_2 p + . . .

     subject to constraints Cp=0 and A_in p > b_in, w.r.t. p given the
     smoothing parameters lambda_i. X is a design matrix, p a parameter
     vector,  y a data vector, W a diagonal weight matrix, S_i a
     positive semi-definite matrix  of coefficients defining the ith
     penalty and C a matrix of coefficients  defining the linear
     equality constraints on the problem. The smoothing parameters are
     the lambda_i. Note that X must be of full column rank, at least
     when projected  into the null space of any equality constraints.
     A_in is a matrix of coefficients defining the inequality
     constraints, while b_in is a vector involved in defining the
     inequality constraints.  

     Quadratic programming is used to perform the solution. The method
     used is designed for maximum stability with least squares
     problems: i.e. X'X is not formed explicitly. See Gill et al. 1981.

_V_a_l_u_e:

     The function returns an array containing the estimated parameter
     vector.

_A_u_t_h_o_r(_s):

     Simon N. Wood simon@stats.gla.ac.uk

_R_e_f_e_r_e_n_c_e_s:

     Gill, P.E., Murray, W. and Wright, M.H. (1981) Practical
     Optimization. Academic Press, London. 

     Wood, S.N. (1994) Monotonic smoothing splines fitted by cross
     validation SIAM Journal on Scientific Computing 15(5):1126-1133

     <URL: http://www.stats.gla.ac.uk/~simon/>

_S_e_e _A_l_s_o:

     'mgcv' 'mono.con'

_E_x_a_m_p_l_e_s:

     # first an un-penalized example - fit E(y)=a+bx subject to a>0
     set.seed(0)
     n<-100
     x<-runif(n);y<-x-0.2+rnorm(n)*0.1
     M<-list(X=matrix(0,n,2),p=c(0.1,0.5),off=array(0,0),df=array(0,0),S=0,
     Ain=matrix(0,1,2),bin=0,C=matrix(0,0,0),sp<-0,y=y,w=y*0+1)
     M$X[,1]<-1;M$X[,2]<-x;M$Ain[1,]<-c(1,0)
     pcls(M)->M$p
     plot(x,y);abline(M$p,col=2);abline(coef(lm(y~x)),col=3)

     # and now a penalized example: a monotonic penalized regression spline .....

     # Generate data from a monotonic truth.
     x<-runif(100)*4-1;x<-sort(x);
     f<-exp(4*x)/(1+exp(4*x));y<-f+rnorm(100)*0.1;plot(x,y)
     dat<-data.frame(x=x,y=y)
     # Show regular spline fit (and save fitted object)
     f.ug<-gam(y~s(x,k=10,bs="cr"));lines(x,fitted(f.ug))
     # Create Design matrix, constriants etc. for monotonic spline....
     gam.setup(y~s(x,k=10,bs="cr")-1,dat,fit.method="mgcv")->G;
     GAMsetup(G)->G;F<-mono.con(G$xp);
     G$Ain<-F$A;G$bin<-F$b;G$C<-matrix(0,0,0);G$sp<-f.ug$sp;
     G$p<-G$xp;G$y<-y;G$w<-y*0+1;

     p<-pcls(G);  # fit spline (using s.p. from unconstrained fit)

     # now modify the gam object from unconstrained fit a little, to use it
     # for predicting and plotting constrained fit. 
     p<-c(0,p);f.ug$coefficients<-p; 
     lines(x,predict.gam(f.ug,newdata=data.frame(x=x)),col=2)

     # now a tprs example of the same thing....

     f.ug<-gam(y~s(x,k=10));lines(x,fitted(f.ug))
     # Create Design matrix, constriants etc. for monotonic spline....
     gam.setup(y~s(x,k=10),dat,fit.method="mgcv")->G;
     GAMsetup(G)->G;
     nc<-40         # number of constraints
     xc<-0:nc/nc # points on [0,1]  
     xc<-xc*4-1  # points at which to impose constraints
     A0<-predict.gam(f.ug,data.frame(x=xc),type="lpmatrix") 
     # ... A0
     A1<-predict.gam(f.ug,data.frame(x=xc+1e-6),type="lpmatrix") 
     A<-(A1-A0)/1e-6    
     # ... approx. constraint matrix (A
     G$Ain<-A;    # constraint matrix
     G$bin<-rep(0,nc);  # constraint vector
     G$sp<-f.ug$sp; # use smoothing parameters from un-constrained fit
     G$p<-rep(0,11);G$p[11]<-0.1  
     # ... monotonic start params, got by setting coefs of polynomial part
     G$p[10]<- -mean(0.1*x) 
     # ... must meet gam side conditions: sum of smooth over x's is zero 
     G$y<-y;G$w<-y*0+1
     p<-pcls(G);  # fit spline (using s.p. from unconstrained fit)

     # now modify the gam object from unconstrained fit a little, to use it
     # for predicting and plotting constrained fit. 
     f.ug$coefficients<-p; 
     lines(x,predict.gam(f.ug,newdata=data.frame(x=x)),col=3)

