## Multiple regression in a spreadsheet

This illustration shows a typical set of data collected over successive weeks and
tabulated in a spreadsheet. Column A contains the kWh consumption figures, and
columns B, C, and D contain the corresponding driving-factor quantities.

In the spreadsheet program's regression analysis tool, the column of consumption figures
(shown in red) is the 'dependent variable' or 'y range'. The block of figures in blue
represents the 'independent variables' or 'x range'. The x range may span several columns, as
shown here, but it would be a single column if there were only one driving factor.

If a particular value in column A were not known, it could be estimated by substituting that row's known values
for $D_1$, $D_2$ and $D_3$ in the equation

$$
\text{Column A estimate} = k_0 + k_1 D_1 + k_2 D_2 + k_3 D_3
$$
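The substitution can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_consumption`, the coefficient values, and the driving-factor values are all hypothetical and do not come from the spreadsheet described here.

```python
def estimate_consumption(coefficients, driving_factors):
    """Return k0 + k1*D1 + k2*D2 + ... for one spreadsheet row."""
    k0, *slopes = coefficients
    if len(slopes) != len(driving_factors):
        raise ValueError("need exactly one coefficient per driving factor")
    return k0 + sum(k * d for k, d in zip(slopes, driving_factors))


# Hypothetical values, chosen only to make the arithmetic easy to follow.
example_coefficients = [100.0, 2.0, 0.5, 10.0]  # k0, k1, k2, k3
example_row = [30.0, 40.0, 5.0]                 # D1, D2, D3 (columns B, C, D)

example_estimate = estimate_consumption(example_coefficients, example_row)
print(example_estimate)  # 100 + 2*30 + 0.5*40 + 10*5 = 230.0
```

The same function would work for a single driving factor by passing a two-element coefficient list and a one-element row.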

The purpose of the regression analysis tool is to find values for
$k_0$, $k_1$, $k_2$, and so on, which give the least error (conventionally, the smallest
sum of squared differences) when estimating the column-A values.
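As a rough sketch of what such a tool does, the following Python fits the coefficients by ordinary least squares, solving the normal equations with plain Gaussian elimination. It assumes that "least error" means the smallest sum of squared errors. The function name `fit_least_squares` and the weekly figures are hypothetical. The consumption values were constructed from known coefficients (50, 2, 3 and 0.5) so that the result can be checked, and a real spreadsheet tool may use a different numerical method internally.

```python
def fit_least_squares(x_rows, y_values):
    """Return [k0, k1, ..., kn] minimising the sum of squared errors."""
    # Prepend a constant 1 to each row so that k0 acts as the intercept.
    design = [[1.0] + [float(v) for v in row] for row in x_rows]
    n = len(design[0])

    # Build the normal equations (X^T X) k = X^T y as an augmented matrix.
    aug = []
    for i in range(n):
        row = [sum(r[i] * r[j] for r in design) for j in range(n)]
        row.append(sum(r[i] * y for r, y in zip(design, y_values)))
        aug.append(row)

    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("driving factors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Hypothetical weekly data: x range (columns B, C, D) and y range (column A).
x_range = [
    [10, 5, 100],
    [12, 7, 90],
    [8, 6, 120],
    [15, 4, 110],
    [11, 9, 95],
    [9, 8, 105],
]
# Built as 50 + 2*D1 + 3*D2 + 0.5*D3, so the fit should recover those values.
y_range = [135.0, 140.0, 144.0, 147.0, 146.5, 144.5]

fitted = fit_least_squares(x_range, y_range)
print([round(k, 6) for k in fitted])  # close to 50, 2, 3 and 0.5
```

Real consumption data will not fit exactly, so the fitted coefficients would then be the ones that leave the smallest total squared error rather than zero error.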