
The Craft of Smoothing

(Full day, 18 March, 9:00 – 17:00)

Paul H. C. Eilers (1) and Brian D. Marx (2)

(1) Erasmus University, The Netherlands

(2) Louisiana State University, USA

 

Summary:

In this course, we describe in detail the basics and use of P-splines, which combine regression on a B-spline basis with difference penalties on the B-spline coefficients. Our approach is practical. We see smoothing as an everyday tool for data analysis and statistics. We emphasize the use of modern software and we provide functions for R.

There will be six sessions and a lab exercise:

Session 1 presents the idea of bases for regression. It will show why global bases, like power functions or orthogonal polynomials, are ineffective and why local bases (Gaussian bell-shaped curves or B-splines) are attractive.

In Session 2, penalties are introduced, as a tool to give complete and easy control over smoothness. The combination of B-splines and difference penalties will be studied for smoothing, interpolation and extrapolation. In these first two sessions the data are assumed to be normally distributed around a smooth curve.
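As a rough sketch of the idea behind Session 2, the P-spline fit for normally distributed data can be written as a penalized least squares problem. The notation below is assumed for illustration and is not taken from the course notes: y is the data vector, B the B-spline basis matrix, a the vector of B-spline coefficients, D_d the matrix that forms d-th order differences of a, and λ ≥ 0 the smoothing parameter.

```latex
\[
S(a) = \lVert y - B a \rVert^2 + \lambda \lVert D_d a \rVert^2,
\qquad
\left( B^\top B + \lambda D_d^\top D_d \right) \hat{a} = B^\top y .
\]
```

With d = 2, the elements of D_2 a are a_j - 2 a_{j-1} + a_{j-2}. Setting λ = 0 gives ordinary regression on the B-spline basis, while larger values of λ pull the coefficients, and hence the fitted curve, toward a smoother shape.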

In Session 3, we extend P-splines to non-normal data, like counts or a binomial response. The penalized regression framework makes it straightforward to transplant most ideas from generalized linear models to P-spline smoothing. Important applications are density estimation and variance smoothing.

Any smoothing method has to balance fidelity to the data and smoothness of the fitted curve. An optimal balance can be found by cross-validation or AIC. This subject is studied in Session 4, as well as the computation of error bands for an estimated curve. We also show how optimal smoothing performs on simulated data, to give you confidence that it makes the right choices.

Session 5 places P-splines in a wider perspective. It presents Bayesian and mixed model interpretations of P-splines. Special attention is paid to streamlined computation.

In the first five sessions we only consider one-dimensional smoothing. When there are multiple explanatory variables, we can use generalized additive models, varying-coefficient models, or combinations of them. Tensor products of B-splines and multi-dimensional difference penalties make an excellent tool for smoothing in two (or more) dimensions. This is the subject of Session 6.

 

In addition, there will be a computer demonstration and lab session, in which R software will be used to solve a number of smoothing problems. One exercise will concentrate on simple functions with limited goals, to improve your understanding of what is going on “under the hood”. The exercise will then continue by applying smoothing to the generalized linear model and to density estimation. Time permitting, a demonstration will use the mgcv package, written by Simon Wood, a large and powerful tool that can handle a variety of situations, including generalized additive modeling. Also time permitting, we will continue with full 2D P-spline smoothing for normal and binomial responses.

 

Course Material:

Participants will need to bring a laptop with R installed, as well as the zip file that contains the P-spline functions, support functions, data sets, and exercises (provided by the organizer). Hard copies of the course notes will be provided.

 

Course fees:

Regular: 125 EUR

Student: 100 EUR
