require(splines) ###Splines bone <- read.table("http://www-stat-class.stanford.edu/~tibs/ElemStatLearn/datasets/bone.data",header=TRUE) boneMaleSmooth <- smooth.spline(bone[bone$gender=="male","age"],bone[bone$gender=="male","spnbmd"],df=12) plot(boneMaleSmooth,ylim=c(-0.05,0.20),col="blue",type="l") points(bone[bone$gender=="male",c(2,4)],col="blue",pch=20) boneFemaleSmooth <- smooth.spline(bone[bone$gender=="female","age"],bone[bone$gender=="female","spnbmd"],df=12) plot(boneFemaleSmooth,ylim=c(-0.05,0.20),col="red",type="l") points(bone[bone$gender=="female",c(2,4)],col="red",pch=20) ###For information on the computation of B-splines see ?bs ###The remaining stuff here was not showed. It is an attempt to ###illustrate the difference between a class of smooth, nice ###but random functions and the dimensionality of the column space ###they can span and the class of smooth, nice functions from a ###given, finite basis expansion and the column space that ###they can span. I am actually not completely convinced of how ###useful this is. ###This function produces a random function that is ###as smooth as determined by the band-width bw of the ###smoother smoother <- function(x,y,bw=0.01) exp(-(x-y)^2/bw) fRandom <- function(x,coef=rnorm(length(x))) { outer(x,x,smoother) %*% coef } curve(fRandom,0,5) ###This function is also random but given by the expansion ###with random coefficients in a cosine basis with l basis ###functions. fComplicated <- function(x,l=20,coef=rnorm(l)) { return(cos(outer(x,1:l,"*"))%*%coef) } curve(fComplicated,0,5) ###By computing a sample of functions evaluated at a certain grid ###and then computing the singular values from the singular value ###decomposition we get an impression of how geometrically independent ###the samples are. For the comlicated function there is a sharp ###cut at 20 dimensions -- by construction the vectors generated belong to ###a 20 dimensional space. For the random function there is a steady ###decrease. In the former case we are better of when trying to find a ###small basis expansion for the function. N <- 500 p <- 40 tmpComplicated <- replicate(p,fComplicated(seq(0,5,length.out=N))) plot(svd(tmpComplicated)$d) tmpRandom <- replicate(p,fRandom(seq(0,5,length.out=N))) plot(svd(tmpRandom)$d)