Polynomial Regression

Exercises

1 Preliminary

Only functions from base R and the stats package (both preloaded) are required, plus the tidyverse packages for data manipulation and visualization and ggfortify for a few ggplot2 extensions.

library(tidyverse)
library(ggfortify) # extends ggplot2, e.g. autoplot() for lm objects
theme_set(theme_bw())

2 Introduction

The file ratWeight.csv contains rat weights measured over 14 weeks during a subchronic toxicity study of genetically modified (GM) corn.

We will only consider the weight of rat B38625.

rat_weight <- read_csv("../../data/ratWeight.csv") %>% filter(id == 'B38625')
rat_weight %>% rmarkdown::paged_table()

Based on this data, our objective is to build a regression model of the form

y_j = f(x_j) + e_j \quad ; \quad 1 \leq j \leq n

We will restrict ourselves to polynomial regression, by considering functions of the form

\begin{aligned} f(x) &= f(x ; c_0, c_1, c_2, \ldots, c_d) \\ &= c_0 + c_1 x + c_2 x^2 + \ldots + c_d x^d \end{aligned}
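As a rough illustration of how a model of this form can be fitted, the sketch below simulates data from a known quadratic and recovers its coefficients with lm. The simulated x and y, the seed, the noise level and the true coefficients are all assumptions made for the example; none of them come from ratWeight.csv.

```r
# Hypothetical data: a quadratic with known coefficients plus Gaussian noise
set.seed(1)
x <- 1:14
y <- 2 + 0.5 * x - 0.1 * x^2 + rnorm(length(x), sd = 0.05)
sim <- data.frame(x = x, y = y)

# Degree-2 polynomial; raw = TRUE makes the coefficients correspond to c_0, c_1, c_2
fit <- lm(y ~ poly(x, degree = 2, raw = TRUE), data = sim)
c_hat <- unname(coef(fit))
print(c_hat)
```

With the default raw = FALSE, poly() uses orthogonal polynomials: the fitted values are the same, but the coefficients no longer correspond to c_0, …, c_d.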

3 Questions

  1. Plot the data

  2. Fit several polynomials (degree 0, 1, 2, 3, 6, …) to this data using the lm function.

  3. Look at some diagnostic plots to eliminate misspecified models.

  4. Compare the predictive performance of the models that have been retained.

  5. Use statistical tests and information criteria to compare the models and select “the best one”.
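To make the last question more concrete, here is a minimal sketch of how nested polynomial models might be compared with F-tests and information criteria. It uses simulated data from a straight line, not the rat weights; the data frame sim, the seed and the objects fit0, fit1, fit2 and ic are hypothetical names introduced for the example.

```r
# Hypothetical data generated from a straight line plus Gaussian noise
set.seed(2)
sim <- data.frame(x = 1:14)
sim$y <- 1 + 3 * sim$x + rnorm(nrow(sim), sd = 0.5)

# Polynomials of increasing degree; degree 0 is the intercept-only model
fit0 <- lm(y ~ 1, data = sim)
fit1 <- lm(y ~ poly(x, 1), data = sim)
fit2 <- lm(y ~ poly(x, 2), data = sim)

# Sequential F-tests between nested models
print(anova(fit0, fit1, fit2))

# Information criteria: lower values indicate a better trade-off
ic <- data.frame(degree = 0:2,
                 AIC = AIC(fit0, fit1, fit2)$AIC,
                 BIC = BIC(fit0, fit1, fit2)$BIC)
print(ic)
```

The F-tests only apply to nested models, whereas AIC and BIC can rank any set of models fitted to the same response, so the two approaches are usually read side by side.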