Data Stat: Regression equation

Sunday, November 23, 2008

Regression equation

It is convenient to assume an environment in which an experiment is performed: the dependent variable is then the outcome of a measurement.

The regression equation deals with the following variables:
• The unknown parameters, denoted β. This may be a scalar or a vector of length k.
• The independent variables, X.
• The dependent variable, Y.

The regression equation relates the dependent variable Y to a function of the variables X and β: Y ≈ f(X, β).
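As a rough illustration of what "a function of X and β" can look like, the sketch below writes two candidate forms as Python functions. Both forms, the names `linear_form` and `exponential_form`, and the numbers are assumptions chosen for illustration; the post itself does not prescribe any particular function.

```python
import numpy as np

def linear_form(X, beta):
    """Hypothetical candidate: f(X, beta) = beta[0] + beta[1] * X, so k = 2."""
    return beta[0] + beta[1] * np.asarray(X, dtype=float)

def exponential_form(X, beta):
    """Hypothetical candidate: f(X, beta) = beta[0] * exp(beta[1] * X), so k = 2."""
    return beta[0] * np.exp(beta[1] * np.asarray(X, dtype=float))

# Made-up inputs: three values of the independent variable and one parameter vector.
X = np.array([0.0, 1.0, 2.0])
beta = np.array([1.0, 0.5])

linear_pred = linear_form(X, beta)      # predicted Y under the linear form
exp_pred = exponential_form(X, beta)    # predicted Y under the exponential form
```

In both cases the same X and β go in and a predicted Y comes out; only the assumed form of the function differs.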

The user of regression analysis must make an intelligent guess about this function. Sometimes the form of the function is known; sometimes it must be found by trial and error.
Assume now that the vector of unknown parameters, β, is of length k. In order to perform a regression analysis, the user must provide information about the dependent variable Y:

• If the user performs the measurement N times, where N < k, regression analysis cannot be performed: the measurements do not provide enough information to determine β.

• If the user performs N independent measurements, where N = k, then the problem reduces to solving a set of N equations with N unknowns β.

• If, on the other hand, the user provides the results of N independent measurements, where N > k, regression analysis can be performed. Such a system is called an overdetermined system.
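The three cases can be sketched with NumPy, assuming a linear function Y = β0 + β1·X (so k = 2). The model, the helper names `design_matrix` and `classify`, and the data values are all hypothetical and serve only to make the counting argument concrete.

```python
import numpy as np

def design_matrix(x):
    """Columns [1, x] for the assumed linear model y = b0 + b1 * x (k = 2)."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x])

def classify(n_measurements, k):
    """Name the case from the number of measurements N and parameters k."""
    if n_measurements < k:
        return "underdetermined"
    if n_measurements == k:
        return "exactly determined"
    return "overdetermined"

# N = 1 < k = 2: the single equation has rank 1, so it cannot pin down 2 unknowns.
rank_under = np.linalg.matrix_rank(design_matrix([0.0]))

# N = k = 2: two equations in two unknowns, solved directly.
x_exact = [0.0, 1.0]
y_exact = np.array([1.0, 3.0])          # made-up measurements
beta_exact = np.linalg.solve(design_matrix(x_exact), y_exact)

case_over = classify(5, 2)              # N = 5 > k = 2
```

With N = k the solution reproduces the measurements exactly, which leaves no surplus information for judging how reliable the parameters are.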

In the last case, regression analysis provides the tools for:

1. finding a solution for the unknown parameters β that will, for example, minimize the distance between the measured and predicted values of the dependent variable Y (the method of least squares, which minimizes the sum of the squared differences).

2. using the surplus of information, under certain statistical assumptions, to provide statistical information about the unknown parameters β and the predicted values of the dependent variable Y.
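Both uses can be sketched for an overdetermined case, again assuming a linear function Y = β0 + β1·X with N = 5 and k = 2. The data are made up, and the standard errors rely on the usual textbook assumptions (independent errors with equal variance), which the post does not spell out.

```python
import numpy as np

# Made-up measurements: N = 5 observations for a model with k = 2 parameters.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

X = np.column_stack([np.ones_like(x), x])   # design matrix with columns [1, x]
n, k = X.shape

# 1. Least squares: choose beta to minimize the sum of squared residuals.
beta_hat, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
# beta_hat is approximately [1.04, 1.99]

# 2. Use the surplus (n - k degrees of freedom) to estimate the error variance
#    and, from it, the standard errors of the estimated parameters.
residuals = y - X @ beta_hat
sigma2 = residuals @ residuals / (n - k)
cov_beta = sigma2 * np.linalg.inv(X.T @ X)
std_err = np.sqrt(np.diag(cov_beta))
```

The divisor n − k is where the surplus of information enters: with N = k it would be zero and no variance estimate would be available.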
