Sunday 28 August 2011

Matrix Operations in R

A matrix is a two-way array of numbers. Most computations in OpenMx are done on matrices, which allow complex computations on data arrays to be done in a very efficient manner.
In the previous script you generated two kinds of matrices. The first was 'mydata', the matrix of values for X and Y. The main way you will use such matrices in OpenMx is to import raw data for analysis, which are then used to extra summary statistics. You have already encountered the commands summary, cov, cor, and sd, which provide basic statistics for  a data array. The main additional thing you need to know how to do is to select a subset of data from a data matrix.

Let us first generate a matrix of random numbers. The rnorm function generates random numbers. If we just want 200 random numbers, we could type:
   myrand=rnorm(200)
Type myrand in the console to inspect what you have created. Although the output may appear like a table, that's just the way it's formatted. In fact, it's just a long list of numbers (a vector), whereas we want the data in rows and columns. To achieve this, we use the 'matrix' command, i.e.:
     myarray=matrix(rnorm(200),nrow=40)
Note that we need to specify the number of rows in the matrix, but R can then work out how many columns. We now want to make a new variable, myarray2, which just has columns 2 and 3 from the original array:
     myarray2=matrix(c(myarray[,2:3]),nrow=40) # Format to have 40 rows and 2 columns
When selecting from myarray, we could have used c(myarray(1:40,2:3), to specify that we wanted all 40 rows and columns 2-3. However, since we wanted all the rows, we just left the row specification blank, which tells R to take the whole row dimension. You should experiment with selecting from myarray in other ways, .e.g. selecting rows 10-20.

The bulk of the computational work on matrices in R will be done on covariance matrices. We shall illustrate the different operations on a simple 3 x 3 matrix, which you enter as follows:
          mymat = matrix(c(1, 1, 2, 3, 1, 2, 4, 3, 1), nrow=3)
Check the output and you will see that the vector of numbers is converted to a 3 x 3 matrix by the use of the matrix command, with nrow specified as 3. But is there something surprising about the result?  You might have expected the values to be entered into the matrix one row at a time, whereas you will see they are entered by columns.  You can change this by giving a specific instruction when setting up the matrix. Compare mymat with mymat2:
mymat2= matrix(c(1, 1, 2, 3, 1, 2, 4, 3, 1), nrow=3, byrow=TRUE)
Individual elements can be extracted from a matrix mymat by specifying mymat[i,j],
which extracts the element in the ith row and jth column.

Transpose:  A matrix is transposed by exchanging rows and columns. In our example, mymat2 is a transpose of mymat. In common usage, the transpose of matrix X is written as X'.
The transpose operation is performed in R with the command t().
Try: 
                t(mymat)
If you start with a 2 x 3 matrix, its transpose with have 3 rows and 2 columns.

Other matrix operations
If you aren't familiar with matrix algebra, you may prefer to just skim this section and return to it as needed. Matrix algebra can fry your brain, and It isn't vital at this point that you understand in detail how all the matrix operations work. However, it is important that you can recognise the symbols for matrix operations in the scripts you encounter. And ultimately, if you stay the course and start to get serious about using OpenMx, you will need to master this material.


Matrix addition and subtraction: these are straightforward, and done element-by-element on two matrices of the same dimensionality. Try, for instance, mymat+mymat2

Matrix multiplication. This is not so simple. You may think it is, because if you type mymat * mymat2, you  will get an answer analogous to addition, with each element of the first matrix multiplied by its corresponding element in the second matrix. However, this is NOT true matrix multiplication.  The operator for matrix multiplication involves three characters, as in this example:
                 mymat%*%mymat2
The explanation below borrows heavily from this excellent account of matrix multiplication.
Let us first define two matrices, A and B, as below. You should by now be able to work out how to create these matrices in R.
A=  [ 1  0  -2
        0  3  -1]

B = [  0   3
        -2 -1
         0  4]
To multiply the two matrices, multiply the ROWS of A by the COLUMNS of B. First take the first row of A and the first column of B, and multiply the first entries, then the second entries, and then the third entries, and then add the three products. The sum is one entry in the product matrix AB; in fact, being the product of row 1 and column 1, the result is the 1,1 entry of AB (i.e. cell for 1st row and 1st column). Then continue in like manner. For instance, the sum of the products from row 2 of A and column 1 of B is the 2,1 entry of AB.

The final answer is:

Matrix multiplication can seem counter-intuitive, but it relates to real-world problems.
For instance,  suppose a saleman makes the following sales:
Monday: 3 T-shirts at $10 each, 4 hats at $15 each, and 1 pair of shorts at $20.
Tuesday: 4 T-shirts at $10 each, 2 hats at $15 each, and 3 pairs of shorts at $20.
If we put the cost of each item in mycosts  = [ 10 15 20]
And the quantities sold in my mysales = [3 4
                                                          4 2
                                                          1 3]
Then the total income for Mon and Tue can be computed by 
     mycosts%*%mysales  = [110 130]

Kronecker product. The right Kronecker product of two matrices A and B is formed by multiplying each element of A by the matrix B. The Kronecker product is easy to confuse with the regular matrix product, because it also uses % symbols, but this time it is %x% rather than %*%.
   let 
A= [2    1
 3    4]

B = [10.0
         0.5]

Then the Kronecker product is:
A%x%B =     
[ 20   10
  1     0.5
 30    40
  1.5  2]

Experiment with simple matrices like these to clarify how Kronecker products work. In OpenMx, one use of the Kronecker product is to simply multiply or divide all elements of a matrix by a constant, e.g. if C =.5, then the Kronecker product
A%x%C =
                [1         0 .5
                1.5        2  ]

Matrix inversion. Matrix inversion is the matrix equivalent of 1/X, or taking X-1 and is conventionally written as X~ or X-1 . The meaning is as for X-1 with a single digit, i.e. just as
4 * 4-1= 1
X * X-1 = 1, except that for a matrix, the result is the identity matrix, I, which is a matrix with zero entries except for ones on the diagonal.

The inverse of a matrix is found in R using 'solve', for example, if
                X = [ 3 1 5
                          2 0 1
                          9 2 7]
Then
solve(X) = [-0.2222222  0.3333333  0.1111111
                 -0.5555556 -2.6666667  0.7777778
                 0.4444444  0.3333333 -0.2222222]
Now multiply X by solve(X) to confirm that this leads to a 3 x 3 identity matrix. You will find that the off-diagonal values have very small but real values, rather than zero. This is due to rounding errors. N.B. remember, to do true matrix multiplication, you have to use %*%.

Determinant
A determinant is a real number associated with a square matrix, which is obtained using the 'det' function in R.
For a simple 2 x 2 matrix
     [a  b
      c d]
the determinant is ad-bc.
The determinant is involved in computations to find the inverse of a matrix, but R does this for us automatically. Unless you are very interested in matrix algebra, you need not worry about how to compute the determinant by hand, though this website gives a clear account:
Eigenvalues and eigenvectors
The command 
    eigen(X)
 gives two matrices as output.

The first is the eigenvalues. For X as defined above, these are:
[12.3300698 -1.9571103 -0.3729595]

The second matrix has the same dimensionality as X and has eigenvectors:
 [ -0.4794238 -0.6904042  0.1286079
   -0.1479209  0.3962907 -0.9855369
   -0.8650273  0.6052238  0.1103495]

A reasonably clear explanation of the meaning of eigenvalue and eigenvector comes from Andy Field's textbook, Discovering Statistics in SPSS, 2nd edition, on which the following description is based.
Start by considering the two-dimensional case. If we have two variables in a bivariate normal distribution, as shown in Figure 1, then their distribution forms an ellipse.
Figure 1: Correlated variables form an elliptical distribution
If we draw lines to measure the height and length of the the ellipse encompassing the distribution, these correspond to eigenvectors; i.e. they are orthogonal lines of a given direction. Each eigenvector has an eigenvalue that indicates its length. We can visualise scaling up to three dimensions, in which case the distribution would have the shape of an American football that could be described by three orthogonal vectors. The ratio of the largest to smallest eigenvalue tells us something about dependencies between variables: returning to the 2D example above, if X and Y were uncorrelated, then the scatterplot would have a circular form, with the two eigenvalues of equivalent size;  on the other hand, if they were so strongly correlated that all the points fell on a single line, then there would be one very large eigenvalue, corresponding to the length of the distribution, whereas the height would be of negligible size. The values of eigenvectors specify their orientation and are of less interest to us.

For a cribsheet on matrix operations in R see http://www.gseis.ucla.edu/courses/ed231a1/notes/matr.html

No comments: