Variable selection: Forward or backward
Model selection:
\begin{itemize}
\item All subset regression
\item Forward stepwise regression
\item Backwards stepwise regression
\end{itemize}
Explanation
R commands: \texttt{add1}, \texttt{drop1}, \texttt{summary}, followed by \texttt{anova(fm.final,fm.full)}
{\bf Example:} Use the ecosystem data set and conduct a forward stepwise regression to predict the growth of cod. Compare the various criteria for model selection.
{\bf Example:} Use the ecosystem data set and conduct a backwards stepwise regression to predict the growth of cod. Compare the various criteria for model selection.
Details
Several methods exist for selecting a regression model.\\
All subset regression considers every possible combination of
independent variables. It identifies all ``good'' models and is
guaranteed to find the ``best'' model under any given criterion, but
fitting every combination is often not feasible when there are many
variables.\\
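To see why the search quickly becomes infeasible, note that each of $p$ candidate variables is either in or out of the model, so all subset regression has to compare $2^p$ models (counting the model with no variables). For example,
\[
2^{10} = 1024, \qquad 2^{30} = 1\,073\,741\,824 \approx 1.07 \times 10^{9}.
\]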
Backwards stepwise regression starts with all independent variables
in a single model and then drops variables one at a time. The variable
dropped at each step is the one whose removal gives the smallest
increase in SSE. This approach is often preferred, but it is not
feasible if the total number of variables is very large.\\
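A minimal sketch of this procedure in R, assuming simulated data in place of the ecosystem data set: the data frame \texttt{dat}, the variables \texttt{x1} to \texttt{x4} and the 5\% stopping level are illustrative assumptions. At each step \texttt{drop1} lists the increase in SSE for every variable, and the variable with the smallest increase is removed unless its $F$-test is significant.
\begin{verbatim}
# Simulated stand-in data (hypothetical): only x1 and x2 affect y.
set.seed(1)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n),
                  x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)

fm.full <- lm(y ~ x1 + x2 + x3 + x4, data = dat)
fm <- fm.full
repeat {
  tab <- drop1(fm, test = "F")             # SSE increase per variable
  tab <- tab[rownames(tab) != "<none>", , drop = FALSE]
  if (nrow(tab) == 0) break                # nothing left to drop
  worst <- which.min(tab[["Sum of Sq"]])   # least increase in SSE
  if (tab[["Pr(>F)"]][worst] < 0.05) break # assumed stopping rule
  fm <- update(fm, as.formula(paste(". ~ . -", rownames(tab)[worst])))
}
fm.final <- fm
formula(fm.final)
anova(fm.final, fm.full)  # do the dropped variables matter jointly?
\end{verbatim}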
Forward stepwise regression selects a sequence of variables, deciding
at each stage which variable to add next. The variable added is the
one that gives the largest (marginal) increase in explained
variation.\\
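The forward procedure can be sketched in R along the following lines, again assuming simulated data instead of the ecosystem data set (the names \texttt{dat} and \texttt{x1} to \texttt{x4} and the 5\% entry level are illustrative assumptions). \texttt{add1} reports the reduction in SSE for each candidate, and the candidate with the largest reduction is added while its $F$-test is significant.
\begin{verbatim}
# Simulated stand-in data (hypothetical): only x1 and x2 affect y.
set.seed(1)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n),
                  x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)

fm.full <- lm(y ~ x1 + x2 + x3 + x4, data = dat)
candidates <- attr(terms(fm.full), "term.labels")
fm <- lm(y ~ 1, data = dat)                # start from the empty model
repeat {
  remaining <- setdiff(candidates, attr(terms(fm), "term.labels"))
  if (length(remaining) == 0) break        # every variable already added
  tab <- add1(fm, scope = formula(fm.full), test = "F")
  tab <- tab[rownames(tab) != "<none>", , drop = FALSE]
  best <- which.max(tab[["Sum of Sq"]])    # largest marginal explained SS
  if (tab[["Pr(>F)"]][best] >= 0.05) break # assumed stopping rule
  fm <- update(fm, as.formula(paste(". ~ . +", rownames(tab)[best])))
}
fm.final <- fm
formula(fm.final)
anova(fm.final, fm.full)  # is anything important still left out?
\end{verbatim}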
Forward stepwise regression is often augmented by allowing a variable
to be dropped after another has been added: a sequence of insertions
may make an earlier variable redundant, in which case it is dropped.
Either version of forward regression is quite feasible, but it may
lead to an incorrect or poor model, since important combinations of
variables may not be found.\\
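One way to try the augmented version in R is the built-in \texttt{step} function with \texttt{direction = "both"}. Note that \texttt{step} chooses each move by AIC rather than by the SSE-based rule described above, so this is a related technique and not the same procedure. The simulated data and variable names are illustrative assumptions.
\begin{verbatim}
# Simulated stand-in data (hypothetical): only x1 and x2 affect y.
set.seed(1)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n),
                  x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)

fm0 <- lm(y ~ 1, data = dat)               # start from the empty model
fm.final <- step(fm0,
                 scope = list(lower = ~ 1,
                              upper = ~ x1 + x2 + x3 + x4),
                 direction = "both",       # allow additions and drops
                 trace = 0)                # suppress step-by-step output
formula(fm.final)
fm.final$anova                             # record of the steps taken
\end{verbatim}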
Each approach thus has good and bad points.
Examples
\begin{xmpl}
Use the ecosystem data set and conduct a forward stepwise regression to predict the growth of cod. Compare the various criteria for model selection.
R commands: \texttt{add1} repeatedly, followed by \texttt{anova(fm.final,fm.full)}
\end{xmpl}
\begin{xmpl}
Use the ecosystem data set and conduct a backwards stepwise regression to predict the growth of cod. Compare the various criteria for model selection.
R commands: \texttt{drop1} or \texttt{summary}, followed by \texttt{anova(fm.final,fm.full)}
\end{xmpl}
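As a rough sketch of how selection criteria might be compared once a final model has been chosen, the following R code tabulates adjusted $R^2$, AIC and BIC for a reduced and a full model. The data are simulated, and the choice of \texttt{x1} and \texttt{x2} as the final model is an assumption made for illustration; with the ecosystem data the models from the examples above would be used instead.
\begin{verbatim}
# Simulated stand-in data (hypothetical): only x1 and x2 affect y.
set.seed(1)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n),
                  x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)

fm.full  <- lm(y ~ x1 + x2 + x3 + x4, data = dat)
fm.final <- lm(y ~ x1 + x2, data = dat)    # assumed selection result

crit <- data.frame(
  model  = c("fm.final", "fm.full"),
  adj.R2 = c(summary(fm.final)$adj.r.squared,
             summary(fm.full)$adj.r.squared),
  AIC    = c(AIC(fm.final), AIC(fm.full)),
  BIC    = c(BIC(fm.final), BIC(fm.full))
)
crit                       # higher adj.R2 and lower AIC/BIC are better
anova(fm.final, fm.full)   # F-test of the reduced against the full model
\end{verbatim}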