Friday, October 5, 2018

An interesting linear algebra problem from signal processing. 

This problem was presented on https://stackoverflow.com/questions/52373413/solve-underdetermined-system-of-equations-for-sparse-solution/52404876#52404876 
It appears that a discussion devoted to it might be interesting, in order to analyse possible approaches to solution. 



We approach the problem by transforming the equation into a form more "recognizable" from the classic linear algebra approach to the solution of systems of linear equations. 


The last expression, essentially, represents a set of m systems of linear equations with one and the same matrix BT and different right-hand sides, which are the columns of CT. In order to proceed further, we need to perform Gaussian elimination for the matrix BT , simultaneously with all the right-hand sides, using some pivoting algorithm. As the matrix is singular, the resulting system of equations, for each particular column j from CT and, correspondingly, for each particular column j from AT, will have the following form: 



Since the matrix of the system is singular, k<n , where k=rank(BT) . The vector of unknowns consists of two smaller vectors: 
 of the size k , containing the components from 1st to k-th of the column j of the AT
                matrix;
  of the size n-k, containing the components from (k+1)th to n-th of the same                           column of  AT

Correspondingly:
  of the size k, containing (permuted, due to the Gaussian elimination with column                    pivoting performed) components of the j-th column of CT ;
  of the size n-k, containing the remaining components  of the j-th column of CT . 

Therefore, upon completion of the Gaussian elimination, the last few equations in the system are trivial:




Therefore, two possibilities only exist: 


1) All of the components of the lower part of the right-hand side vector are zeros:


In that case, the system has infinite number of solutions, since any alpha-vector would satisfy the subsystem. Since our goal is to find a solution of the original system as "sparse" as possible, we can safely set all components of the alpha-vector to zeros, and solve the remaining equations by backward substitution, thus obtaining a unique solution. 


2) At least one component of the right-hand side (ie. of the chi-vector) of the above subsystem is different from zero. In that case, the subsystem has no solutions at all, and, therefore, the original system has no solutions as well. In such case, the only way to proceed is to try to find an approximation to the solution in terms of the least squares. 

We will perform the analysis from the very beginning, using a well-established method of Singular Value Decomposition, or SVD for short. Given a system of linear equations, say, Fx=g , matrix F can be represented as the product of multiplication of three matrices (we limit ourselves by case of square matrix F , i.e. in the context of the current problem): 



In the context of the problem which is being discussed here, the diagonal matrix does contain at least one zero element on the diagonal, since the original matrix is supposed to be singular. Here, we will consider a simple SLAE of the size 5x5 where 
- two rows of the matrix are linearly dependent, and
- the system is incompatible, so that no exact solution exists, and therefore we will try to find a least squares approximation to the solution.
The 5-th row of the matrix equals the product of multiplication of the 3rd row by the factor of -2. Now, if we select the right-hand side of the system in such a way that the 5th element of the vector g does not equal the 3rd element multiplied by -2 , the whole SLAE is incompatible, and we can only try to find a least squares approximation to the solution. 

Lets put the right-hand side vector g as follows: g = [ 1 ,  2,  3,  4,  5 ]
The SLAE is, obviously, incompatible. 

Now, what the SVD algorithm delivers:




The elements of the diagonal matrix, named here as singular values, contain one zero element, as expected. The SVD-algorithm prescribes the following procedure to obtain a least-squares solution of the system:

1) Introduce vector d=UTg .

2) Calculate vector z as follows:
   
                                zj = dj / sj  , if sj is a non-zero value;
                                zj can be chosen arbitrarily, if sj = 0. 

3) The solution is obtained as the following product: x=Vz

And here comes the beauty of the SVD-method: regardless of the chosen value of zj for which  sj = 0, the program will deliver a solution vector from the R5 space such that the norm of the residuals vector will be the same, and it will be the least possible value for this system. (Obviously, if the system was compatible, the residuals would all equal zero). 

We provide just three cases to demonstrate that for our example. Namely, we set z5 to 0., to 1., and to a randomly selected value, and see what happens. 


As we can see, each particular choice of z5 leads to a different solution vector. Nevertheless, the residuals vector is one and the same in all cases, which means that all three solutions are equally valid in terms of least squares. 

Since there is an infinite number of least-squares solutions, each particular one corresponding to one particular value of chosen z5, we can probably consider an opposite task: set one component of the solution vector to zero, and find another components such that the vector of the residuals will stay the same (obviously, some particular value of z will be calculated along the way). The more singular values si  equal zero, the more components of the solution vector can theoretically be made equal zero. We should recall that our original goal was to deliver as sparse matrix A as possible. We consider that as possible subject of further algorithm development. 




                       











Thursday, September 20, 2018

Solution of least squares problems using normal equations. 

Least squares problems assume that an over-determined system of linear algebraic equations needs to be solved in a way that the solution vector somehow minimizes the residuals vector that arises when the actual right-hand side of the system is subtracted from the product of the system's matrix and the obtained solution. 

A system of linear algebraic equations is over-determined if the number of rows in the system exceeds the number of columns (a case when some of the equations are linearly dependent is omitted for simplicity). Solving such system in classical sense is impossible; however, a method is known that transforms such system into a system of equations with square, symmetric, positive definite matrix. The obtained system is called a system of normal equations. The procedure is as follows: the original system is multiplied from the left by the original matrix transpose. 

The resulting matrix of the system, since it is now symmetric and positive definite, allows application of the Conjugate Gradients Method (or CGM). That fact has a very big significance for large-scale problems with sparse matrices: the CGM allows solution of systems consisting of millions of equations, and the matrix can be stored completely in the computer's Random Access Memory (or RAM) as long as the matrix is sparse (ie. if it contains, say, a few dozens of non-zero elements per row). 

In order to illustrate how the CGM can be applied to over-determined matrices, and how the numerical quality of the solution can be evaluated, we consider a rather simple example.



In order to obtain a solution, the approach based on the use of normal equations  and the Conjugate Gradients Method has been used. The solver was written in C and run on a computer with Intel Celeron N3450 CPU. 



An issue of special importance is verification of the solution. Even if the CGM iterations have converged, we can have reasonable doubts whether the obtained solution is numerically solid. The way to perform such verification is to multiply the original matrix by the solution vector and check: how different the obtained right-hand side (or RHS) is from the original one. As it can be seen from the example above, the solution is trustworthy, since the b[i] vector components aren't different from the original ones. 

Wednesday, September 5, 2018

Solving of Systems of Linear Algebraic Equations with Very Large Sparse Matrices.


Obtaining numerical solutions of partial differential equations using the Finite Difference or Finite Element Method inevitably brings up a necessity to solve systems of linear algebraic equations with very large, sparse matrices. 


How actually large is "large" ? Several tens of thousands of equations is a pretty ordinary system for modern computers. However, real-life problems arising from the use of PDE, especially three-dimensional, can lead to systems with much larger matrices. Two or three million equations, involving a sparse matrix, do represent a problem, unless you have an access to a computer somewhat more powerful than an ordinary desktop. 

But you still can use your ordinary desktop computer to solve such system. We set a goal to develop a code capable of solving such large problems. Below, we discuss an example. 

Consider calculation of deflections of a perforated panel using the Finite Element Method. The model presented here is a square-shaped panel, with two (vertical) edges clamped and a distributed lateral pressure applied to each finite element. The panel contains a set of 70 openings. The finite element model has the following properties: 


We selected this particular example for a purpose. It is known that iterative processes converge faster
or slower depending on the matrix spectral properties. Thin-walled structures are notoriously
known as having rather ill-conditioned matrices, which can make the convergence slow.
Nevertheless, our code has demonstrated excellent performance.

All technical parameters of the computer used and the obtained results are presented below.



In order to demonstrate the influence of the matrix spectral properties on the convergence, we also ran 
the same model but this time prepared for a thick plate: the thickness was selected to be 40 mm instead 
of three (the lateral load was also increased to 10^5 N/m^2). In this case the convergence to solution 
was achieved in just 1 min. 45 sec. !

Why the above comparison is important ? The Finite Element Method is often used for the analysis 
of three-dimensional models consisting of tetrahedral and/or hexahedral elements. It turned out that the 
spectral properties of the corresponding stiffness matrices demonstrate behavior similar to the second 
case from the above. Thin plates are "more difficult" from the convergence point of view than thick plates
or 3D models. 

Our next example is a perforated panel similar to the previous one, but with significantly larger number 
of quadrilateral elements. This time the model has twenty rows and twenty five columns of openings.




                   


This problem was aslo solved by our code; the results are presented above. It is interesting to note
that this time the convergence rate (which is assumed to be a ratio between the number of equations
and the number of iterations performed for convergence) was much higher than in the previous
example. As mentioned before, after a number of additional numerical experiments we found that
the principal contributing factor was the panel thickness: modeling of thick plates by Mindlin elements
leads to the stiffness matrices with significantly better spectral properties than modeling of thin plates,
when the matrices can have relatively large condition ratios, leading to slower convergence of the iterations.

The above examples demonstrate that very large problems (of up to several millions of equations
with sparse matrices) can be successfully solved on ordinary PCs, with the whole process taking
place in the Random Access Memory. No data transfer between the RAM and the hard disk 
is required.

The readers are invited to actively participate in the discussion and share their own experiences
in solving systems of linear equations with large sparse matrices.