This generalized inverse matrix was reinvented by Penrose at the age of 24 | N-word rough linear algebra

This generalized inverse matrix was reinvented by Penrose at the age of 24 | N-word rough linear algebra

Many students feel confused and painful when they first learn linear algebra, and they cannot appreciate the practical significance of the course. This is largely because the textbooks must start from basic abstract concepts in order to go from the shallow to the deep and step by step, and the truly intuitive part often has to wait until the later subdivisions or specific applications. Therefore, beginners often know the results but not the reasons; they only see the trees but not the forest. I hope this article can help you change your perspective and see a different linear algebra with a relaxed and interesting daily perspective.

This article is the fifth in a series of articles titled "A rough guide to linear algebra in N articles". In the previous article, we discussed how to deal with the situation where a system of linear equations has no solution or has infinite solutions. When A is a full-rank square matrix, the linear equation

Written by | Wu Jinyuan

In the last chapter, a nearsighted otaku went downstairs to a breakfast shop to buy breakfast one day. He forgot his glasses at home and couldn't see the price list on the blackboard. So, while waiting in line, the otaku listened to the number of breakfast items purchased by the customer in front of him and the total price reported by the waitress, and calculated the unit price of various breakfast items based on this.

The otaku bought breakfast and ate while thinking. We know that only square matrices, that is, matrices with the same number of rows and columns, can have inverse matrices. However, even square matrices do not necessarily have inverse matrices. Because for a matrix to have an inverse, it must be square and full rank. However, in the real world, we often encounter matrices that are not square or matrices that are not (column) full rank. In this case, can we only stare blankly and be at a loss?

This question itself is a real problem. When solving a system of linear equations, although we may not be able to find the only solution to the equation, we hope to understand the relationship between the unknowns through this system of equations. This requires us to find the generalized inverse matrix of an arbitrary matrix.

(1) Real mathematics taught by fake art teachers

For an arbitrary matrix A , no matter whether it is square or not, and whether its rank is full or not, if there is a matrix G that satisfies the condition AGA = A , then G is called a generalized inverse matrix of A. If both conditions AGA = A and GAG = G are met at the same time, then G is called a reflexive generalized inverse matrix. Let's discuss an example below.

The role of matrix can be explained by coordinate transformation. In fact, painting and photography can be regarded as coordinate transformation. Of course, if I say this, artists will pick up a broom and chase me out, so here we need to limit what we are talking about to fake art teachers like me. For example, one day, I drew an apple.

My drawing skills are very poor, but if a trained painter were to draw something very similar like the one below, it should not be difficult.

This picture is actually photoshopped, and is used here to show that some painters can indeed draw it so similarly.

We can think of photography as a matrix: A. Its function is to transform physical objects ( x ) or paintings ( v ) into photographs ( u ).

Objects become photos: u = Ax

Painting becomes photo: u = Av

Similarly, we can think of a painting as a matrix: G. Its function is to turn a physical object or a photo into a painting.

Objects become paintings: v = Gx

Photos become paintings: v = Gu

At the scene, it is not difficult to see that the paintings and the real objects are very different. Similarly, the photos and the real objects, and the photos (photos taken at a certain angle) and the paintings are also different.

However, we can do the following:

In this process, if Photo 1 is the same as Photo 2, the painting matrix G can be regarded as the generalized inverse matrix of photography A , because they satisfy AGA = A.

Similarly, we can do the following:

If drawings 1 and 2 are the same, then GAG = G.

I want to emphasize here that painting is obviously not the inverse matrix of photography, because we cannot draw a real apple from a photo, even a 3D model of an apple. But painting is the generalized inverse matrix of photography, because after turning a photo into a painting through painting, and then turning the painting into a photo through photography, the two photos are the same.

Similarly, photography is not the inverse matrix of painting, but it is its generalized inverse matrix.

Both are the generalized inverse of each other, and both are reflexive.

(2) Finding a reliable generalized inverse matrix

Now, let's change the math teacher's angle to discuss this. What conditions must the generalized inverse matrix satisfy? As mentioned before, for a matrix A , if there is another matrix G that satisfies the condition AGA = A , G is a generalized inverse matrix of A. We still use the example of the myopic otaku buying breakfast to illustrate the definition of the generalized inverse matrix.

For example, assume there are only two customers in front of the otaku, as shown in the following table.

These two transactions form a linear equation system, which can be written as follows using matrix multiplication:

Obviously, we have three unknowns here, but only two equations. The constraints are not sufficient and there is no unique solution. From the coefficient matrix, this matrix is ​​short and fat, and obviously not full rank, so this matrix has no corresponding inverse matrix. But we can define the generalized inverse matrix of this matrix.

This definition can be drawn as the following diagram.

The breakfast shop has three kinds of food, and the three unit prices form a vector x . The quantities of different kinds of food purchased by two customers form a 2-row 3-column matrix A. The total price of the two customers' shopping forms a vector y with two elements. The calculation process of the total price is a matrix multiplication Ax = y , which is shown in the first row of the figure.

We start with the total shopping price y of the two customers. Although we cannot calculate the exact unit price of the three foods, we can calculate a set of possible unit prices u , as shown in the second row of the figure above. This algorithm is to use the two elements in y as a linear combination. This linear combination can be regarded as a matrix multiplication: u = G y , where G is the generalized inverse matrix of A , which can be calculated from A. Although this calculation is more complicated, we will not discuss it for the time being. Note that the possible unit price u calculated here may be completely different from the unit price x we ​​know in advance, but this set of possible unit prices should be reliable.

But what does "reliable" mean? Let's explain it through the third line of the above figure. Simply put, Au = y . In other words, when we use this new set of possible unit prices to replace the real unit prices we know in advance to calculate the total shopping price of the first two customers, the result must remain unchanged. Specifically in our example, the unit prices we know in advance are: 3 yuan for oil cakes, 4 yuan for tea eggs, and 7 yuan for tofu pudding. Using the generalized inverse matrix, we may calculate another set of possible unit prices: 3 yuan for oil cakes, 5.5 yuan for tea eggs, and 5.5 yuan for tofu pudding. This set of possible unit prices is obviously not the real unit prices we know in advance, but the total shopping price of the two customers calculated by this set of possible unit prices is unchanged. In fact, when the unit price of tea eggs plus the unit price of tofu pudding equals 11 yuan, we always get the same result.

(3) There may be more than one generalized inverse matrix

We require the generalized inverse matrix to be "reliable", or AGA = A , which is a very loose condition. Because this condition is very loose, for a matrix A , there may be infinite matrices G that meet the condition. Therefore, a matrix may have infinite generalized inverse matrices.

Of course, if the matrix A is a full-rank square matrix, then it has only one generalized inverse matrix, and this generalized inverse matrix is ​​the inverse matrix we have learned before.

In AGA = A , it is true that the product ( GA ) is a bit like the identity matrix, because when it is multiplied with A, it will also get A : A ( GA ) = A , but ( GA ) is usually not the identity matrix, it can usually only restore A but not other matrices.

(4) Picking and choosing in the generalized inverse matrix

Although the matrix G that satisfies AGA = A is a "reliable" generalized inverse matrix, there are generally countless matrices G that satisfy this condition. For example, in the previous example, as long as a matrix G can calculate that the unit price of tea eggs plus the unit price of tofu pudding is equal to 11 yuan, it can be considered a generalized inverse matrix.

We naturally wonder if we can narrow the range of generalized inverse matrices we choose, preferably to just one?

We humans have always been like this, we get anxious when we don't have anything, and we also get anxious when we have too much. For example, when a rich lady is looking for a son-in-law, she first puts forward the condition that he must be tall, and when there are many people who meet the condition, she puts forward the condition that he must be from a wealthy family, and then the condition that he must be handsome.

In fact, the idea of ​​narrowing the range of the generalized inverse matrix is ​​the same. There is nothing that cannot be achieved by adding restrictions. If it doesn't work, just add a few more conditions.

The condition that comes to mind most easily is GAG = G , which is symmetrical to AGA = A in appearance. What it means can be illustrated by the following figure.

In the example of buying breakfast, we can use the generalized inverse matrix to calculate the unit price of three kinds of food according to the payment amount y' of two customers. Although this unit price is not the same as the real unit price, it can provide some valuable information. For example, in theory, if the payment amount of two customers today is lower than yesterday, we can guess that there is a discount today and the unit price of food has been reduced.

Using one of the many generalized inverse matrices in our hands, the myopic otaku can calculate today's food unit price u = Gy' , as shown in the first row of the figure above. Of course, this set of unit prices is not necessarily correct. Using this set of unit prices, the waitress can calculate the payment amount of the two customers: Au = y , as shown in the second row of the figure above. Note that this set of payment amount y and the y' heard by the myopic otaku may be different. Now if the myopic otaku calculates the food unit price u = Gy in reverse, as shown in the third row of the figure above, in general, he will not necessarily get the same result. However, since there are generally countless generalized inverse matrices, if we choose a suitable generalized inverse matrix G at the beginning , which satisfies GAG = G , then the first and third rows of the figure above will both calculate the same unit price.

We can see that the two conditions AGA = A and GAG = G are symmetrical, and the pair of matrices A and G that meet these two conditions are generalized inverse matrices of each other, and they are reflexive to each other. In some literature, the matrix G that meets the condition AGA = A is called the inner inverse of A ; the matrix G that meets the condition GAG = G is called the outer inverse of A.

With the two conditions AGA = A and GAG = G , the number of generalized inverse matrices that meet these two conditions is obviously reduced, but what is maddening is that for a general matrix A , we may still have countless G. This requires us to further increase the restrictions, which we will discuss further later.

(5) Pick out the only generalized inverse matrix

Adding restrictions can narrow our search scope. It is conceivable that if the selected conditions are appropriate, we may be able to pick out only one generalized inverse matrix from countless generalized inverse matrices. Of course, we must also ensure that the conditions we set are not too harsh, because then there may be an embarrassing situation where no generalized inverse matrix meets all the conditions. Fortunately, we now have at least one set of conditions to ensure that one and only one generalized inverse matrix is ​​selected.

Now, it's time for Penrose to shine. For any matrix A , there exists one and only one matrix G that satisfies the following four conditions. This matrix G is called the Penrose inverse matrix of matrix A.

It can be seen that among these four conditions, the first one is the definition of the generalized inverse matrix, and the second one is the relational expression of the reflexive generalized inverse matrix we just discussed. In the third and fourth relations, "*" means conjugate transposition, which means replacing all elements with conjugate complex numbers and swapping the rows and columns of the elements. If all elements of the matrix in the brackets are real numbers, the conjugate transposition "*" is equivalent to the transposition "T", or the product of the two matrices in the brackets AG or GA is symmetric. (Here, readers are reminded that A is usually not square, so G is usually not square. But AG and GA are square, and these two square matrices are generally one larger and the other smaller.) The third condition allows the generalized inverse matrix to provide the least squares solution. G , which satisfies the first and third conditions at the same time, can use Gy to calculate the least squares solution when Ax = y is self-contradictory and has no solution. (When the original system of equations is not self-contradictory, Gy is the solution of the system of equations). The fourth condition allows the generalized inverse matrix to provide the minimum norm solution. We know that if the original system of equations is not sufficiently constrained, it may have infinite solutions. Now the role of the fourth condition is to pick out the minimum norm solution.

There are almost no living people whose names have been included in university textbooks, but Roger Penrose is still alive and well. He won the Nobel Prize in Physics in 2020 for his work on black holes.

Roger Penrose | Nobel Prize Outreach. Photo: Fergus Kennedy

Penrose discovered the Penrose inverse in 1955. Actually, this inverse matrix had been discovered by predecessors more than 30 years ago. However, Penrose didn't know about it at the time. In a sense, it can be regarded as reinventing wheels.

Of course, Penrose cannot be blamed for this, because many people in the mathematics community did not know about it at the time. In fact, before Penrose, in 1951, Swedish geodesist Arne Bjerhammar had reinvented the wheel. When Penrose submitted his paper, it is at least conceivable that the reviewers of his paper did not know the work of his predecessors, otherwise the paper might not have been published. You may think which editor dares to reject the paper of a Nobel Prize winner? But Nobel Prize winners also have their own youth. Penrose won the Nobel Prize more than 60 years later, when he was just a 24-year-old doctoral student.

The mathematician who discovered the generalized inverse matrix in 1920 was Eliakim Hastings Moore. He used the projection of the matrix on the row-column subspace to express his discovery. It was very abstract and difficult to understand, so not many people continued to study it. I did not find the original text of his 1920 paper, but I read a paraphrase of it by later generations. When people describe an article as difficult to understand, they say that they know all the words but do not understand the meaning. However, I did not even recognize some of the letters in this article. I searched the Greek, Hebrew, and Cyrillic alphabets on the Internet but could not find anything. Finally, I found that the unknown letter was a kind of cursive Latin letter.

Moore | wikipedia

A few years later, people knew that Penrose's inverse and Moore's inverse were equivalent, but Penrose and Moore's expressions were very different. In addition, Penrose's four conditions can be separated and mixed with each other, which can reveal more novel properties of the generalized inverse matrix. Therefore, the mathematical community fully recognized Penrose's new contribution, and people now call this inverse the Moore-Penrose inverse. We will discuss this further later. (To be continued)

Special Tips

1. Go to the "Featured Column" at the bottom of the menu of the "Fanpu" WeChat public account to read a series of popular science articles on different topics.

2. Fanpu provides a function to search articles by month. Follow the official account and reply with the four-digit year + month, such as "1903", to get the article index for March 2019, and so on.

Copyright statement: Personal forwarding is welcome. Any form of media or organization is not allowed to reprint or excerpt without authorization. For reprint authorization, please contact the backstage of the "Fanpu" WeChat public account.

<<:  Asia Winter Science Popularization Issue 16丨Alpine Skiing: Dynamic Symphony of Force and Energy

>>:  Coffee causes cancer? Don’t panic! This article will help you sort out the love-hate relationship with coffee

Recommend

iPad mini 4 processor and memory confirmed: really good!

The iPad mini 4 is the most low-key of the bunch ...

Technology News | Japan uses supercomputer to predict six quark particles

【Today’s cover】 With the arrival of winter, Tangg...

Gaia-0 Basic Beginner Short Video Monetization Training Camp

Gaia-0 Basics Novice Short Video Monetization Tra...