Thursday, September 20, 2018

Solution of least squares problems using normal equations. 

Least squares problems arise when an over-determined system of linear algebraic equations needs to be solved in such a way that the solution vector minimizes (in the Euclidean norm) the residual vector obtained when the actual right-hand side of the system is subtracted from the product of the system's matrix and the computed solution. 
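Written out explicitly (the symbols below are introduced here only for illustration: A is the m-by-n matrix of the system with m > n, b is its right-hand side, and x is the unknown vector), the problem is to find the x that minimizes the Euclidean norm of the residual:

    || A*x - b ||  ->  min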

A system of linear algebraic equations is over-determined if the number of rows in the system exceeds the number of columns (the degenerate case, when the columns of the matrix are linearly dependent, is omitted for simplicity). Solving such a system in the classical sense is impossible; however, a method is known that transforms such a system into a system of equations with a square, symmetric, positive definite matrix. The obtained system is called a system of normal equations. The procedure is as follows: the original system is multiplied from the left by the transpose of the original matrix. 
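In the notation introduced above, multiplying both sides of A*x = b from the left by A^T yields the system of normal equations:

    (A^T A) * x = A^T * b

Its matrix A^T A is square (n-by-n) and symmetric, and it is positive definite as long as the columns of A are linearly independent.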

The resulting matrix of the system, since it is now symmetric and positive definite, allows application of the Conjugate Gradients Method (or CGM). That fact is of great significance for large-scale problems with sparse matrices: the CGM allows solution of systems consisting of millions of equations, and the matrix can be stored completely in the computer's Random Access Memory (or RAM) as long as the matrix is sparse (i.e. if it contains, say, a few dozen non-zero elements per row). 
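One practical detail is worth noting: A^T A never has to be formed explicitly. Inside the CGM iteration it is enough to multiply a vector by A and then by A^T, so only the original sparse matrix needs to be stored. Below is a minimal sketch in C of such matrix-vector products for a matrix held in compressed sparse row (CSR) storage; the type and function names (csr_matrix, csr_matvec, csr_matvec_transposed) are illustrative and are not taken from the actual solver discussed later in this post.

    #include <stddef.h>

    /* A sparse m-by-n matrix in compressed sparse row (CSR) storage. */
    typedef struct {
        size_t m, n;           /* numbers of rows and columns            */
        const size_t *row_ptr; /* m + 1 offsets into col_idx / val       */
        const size_t *col_idx; /* column index of each non-zero element  */
        const double *val;     /* value of each non-zero element         */
    } csr_matrix;

    /* y = A * x, where x has n components and y has m components. */
    void csr_matvec(const csr_matrix *A, const double *x, double *y)
    {
        for (size_t i = 0; i < A->m; ++i) {
            double s = 0.0;
            for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
                s += A->val[k] * x[A->col_idx[k]];
            y[i] = s;
        }
    }

    /* z = A^T * y, where y has m components and z has n components. */
    void csr_matvec_transposed(const csr_matrix *A, const double *y, double *z)
    {
        for (size_t j = 0; j < A->n; ++j)
            z[j] = 0.0;
        for (size_t i = 0; i < A->m; ++i)
            for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
                z[A->col_idx[k]] += A->val[k] * y[i];
    }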

In order to illustrate how the CGM can be applied to over-determined systems, and how the numerical quality of the solution can be evaluated, we consider a rather simple example.



In order to obtain a solution, the approach based on normal equations and the Conjugate Gradients Method described above was used. The solver was written in C and run on a computer with an Intel Celeron N3450 CPU. 
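For reference, a minimal sketch of the core CGM loop applied to the normal equations is given below. It is not the actual solver used for the example above: the function name cg_normal_equations, the initial guess handling and the stopping rule (a fixed tolerance on the normal-equations residual) are assumptions made for illustration, and it relies on the csr_matrix type and matrix-vector products from the sketch earlier in this post.

    #include <stdlib.h>

    /* Solve (A^T A) x = A^T b by the Conjugate Gradients Method.  On entry
     * x holds an initial guess (e.g. all zeros); on exit it holds the
     * approximate solution.  A^T A is never formed explicitly: each
     * iteration multiplies by A and then by A^T.                          */
    void cg_normal_equations(const csr_matrix *A, const double *b, double *x,
                             size_t max_iter, double tol)
    {
        size_t m = A->m, n = A->n;
        double *r = malloc(n * sizeof *r);   /* residual of the normal eqs. */
        double *p = malloc(n * sizeof *p);   /* search direction            */
        double *t = malloc(m * sizeof *t);   /* temporary vector, t = A p   */
        double *q = malloc(n * sizeof *q);   /* q = A^T A p                 */

        /* r = A^T b - A^T A x;  p = r */
        csr_matvec(A, x, t);
        csr_matvec_transposed(A, t, q);
        csr_matvec_transposed(A, b, r);
        for (size_t j = 0; j < n; ++j) {
            r[j] -= q[j];
            p[j] = r[j];
        }

        double rr = 0.0;
        for (size_t j = 0; j < n; ++j)
            rr += r[j] * r[j];

        for (size_t it = 0; it < max_iter && rr > tol * tol; ++it) {
            /* q = A^T (A p) */
            csr_matvec(A, p, t);
            csr_matvec_transposed(A, t, q);

            double pq = 0.0;
            for (size_t j = 0; j < n; ++j)
                pq += p[j] * q[j];
            double alpha = rr / pq;

            for (size_t j = 0; j < n; ++j) {
                x[j] += alpha * p[j];
                r[j] -= alpha * q[j];
            }

            double rr_new = 0.0;
            for (size_t j = 0; j < n; ++j)
                rr_new += r[j] * r[j];
            double beta = rr_new / rr;

            for (size_t j = 0; j < n; ++j)
                p[j] = r[j] + beta * p[j];
            rr = rr_new;
        }

        free(r); free(p); free(t); free(q);
    }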



An issue of special importance is verification of the solution. Even if the CGM iterations have converged, we may still have reasonable doubts about whether the obtained solution is numerically sound. The way to perform such verification is to multiply the original matrix by the solution vector and check how much the obtained right-hand side (or RHS) differs from the original one. As can be seen from the example above, the solution is trustworthy, since the recomputed b[i] vector components do not differ from the original ones.
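Such a check is easy to automate. A minimal sketch, again relying on the illustrative csr_matrix type and csr_matvec routine from above: it recomputes the right-hand side from the obtained solution and returns the largest absolute deviation from the original b.

    #include <math.h>
    #include <stdlib.h>

    /* Recompute the RHS from the obtained solution x and return the largest
     * absolute difference from the original right-hand side b.             */
    double max_rhs_deviation(const csr_matrix *A, const double *x, const double *b)
    {
        double *b_check = malloc(A->m * sizeof *b_check);
        csr_matvec(A, x, b_check);           /* b_check = A * x */

        double max_dev = 0.0;
        for (size_t i = 0; i < A->m; ++i) {
            double dev = fabs(b_check[i] - b[i]);
            if (dev > max_dev)
                max_dev = dev;
        }
        free(b_check);
        return max_dev;
    }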