The Outline of the Course

The following are two main mathematical objects that we will study in this course:[br][br][list][*]Matrices[/*][*]Vectors[/*][/list][br]They are intimately related as briefly shown in the following table:[br][br][center][/center][table][tr][td][b]Vectors[/b][/td][td][b]Matrices[/b][/td][/tr][tr][td]Vectors / points in [math]\mathbb{R}^2[/math][/td][td]2x1 column matrices[/td][/tr][tr][td]Vectors / points in [math]\mathbb{R}^3[/math][/td][td]3x1 column matrices[/td][/tr][tr][td]Vector addition[/td][td]Matrix addition[/td][/tr][tr][td]Scaling a vector[/td][td]Scalar multiplication[/td][/tr][tr][td]Linear Transformations from [math]\mathbb{R}^2[/math] to [math]\mathbb{R}^2[/math][br][/td][td]2x2 matrices[/td][/tr][tr][td]Linear Transformations from [math]\mathbb{R}^3[/math] to [math]\mathbb{R}^3[/math][br][/td][td]3x3 matrices[/td][/tr][/table][br][br]They are studied in an important branch of mathematics called "[b]linear algebra[/b]". Matrices belong to the computation side of linear algebra, whereas vectors belong to the geometric side of it. In this course, we will study both and their relationships in detail.[br][br]As you will see, we will mainly use vectors in [math] \mathbb{R}^2 [/math] or [math] \mathbb{R}^3 [/math] as examples because they can be more easily illustrated in GeoGebra applets. However, the same theory can readily be extended to higher dimensional spaces. Therefore, most of the theorems that you will see in this course are also valid in general n-dimensional spaces.[br][br][br]
Outline
The following is the outline of this course: [br][list][*]Definition of a vector and its matrix representation[/*][*]Vector addition and scaling, linear combination, span[/*][*]Linear independence, basis, dimension[/*][*]Linear transformations[/*][*]Systems of linear equations[br][/*][*]Gaussian elimination[/*][*]Solving systems of linear equations[/*][*]Computing the inverse of a matrix[br][/*][*]Determinants[/*][*]General vector spaces[/*][*]Column space and null space[/*][*]Rank theorem[/*][*]Eigenvalues and eigenvectors[/*][*]Diagonalization[/*][*]Inner product and orthogonality[/*][*]Orthogonal projections and Gram-Schmidt process[/*][*]Least squares method[br][/*][/list]

What is a Vector?

Vectors are often used in physics to represent quantities like force and velocity when we need to specify both the magnitude and direction of those quantities. Usually, we represent a vector visually by an arrow. In the applet below, vectors in [math]\mathbb{R}^2[/math] and [math]\mathbb{R}^3[/math] are shown. [br][br][b]Remark[/b]: There is a very special vector that does not have any direction. What is it?
Usually, vectors are drawn as arrows pointing out from the origin (the point where all axes intersect). In a coordinate space (either 2D or 3D), the coordinates of the point at the arrowhead uniquely determine the vector. Therefore, we can express the vectors v in [math]\mathbb{R}^2[/math] and u in [math]\mathbb{R}^3[/math] mathematically as follows:[br][br][math]v = \begin{pmatrix} v_x \\ v_y \end{pmatrix} [/math] and [math]u = \begin{pmatrix} u_x \\ u_y \\ u_z \end{pmatrix} [/math][br][br]where [math]\left(v_x,v_y\right)[/math] and [math]\left(u_x,u_y,u_z\right)[/math] are the points at the arrowheads of v and u respectively. The above notations for the vectors are called [b]column vectors[/b]. They are in fact two specific kinds of [b]matrices[/b]: v is a [b]2 x 1 matrix[/b] and u is a [b]3 x 1 matrix[/b].[br][br](You can freely drag the points on the arrowheads of both vectors in the applet above to change the length and the direction of the vectors.)[br][br]Moreover, two vectors are regarded as equal if they have the same length and direction. Hence, sometimes we may shift a vector to another position if necessary, as long as its length and direction remain unchanged.[br][br](You can freely drag the green vectors in the applet above to any position you like and they are considered the same vectors as the black ones pointing out from the origin.)[br]
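For example (the numbers here are chosen arbitrarily, just for illustration): the vector in [math]\mathbb{R}^2[/math] whose arrowhead is at the point [math]\left(3,2\right)[/math] is written as the column vector [math]v=\begin{pmatrix} 3 \\ 2 \end{pmatrix}[/math], a 2 x 1 matrix, and the vector in [math]\mathbb{R}^3[/math] whose arrowhead is at [math]\left(1,-2,4\right)[/math] is written as [math]u=\begin{pmatrix} 1 \\ -2 \\ 4 \end{pmatrix}[/math], a 3 x 1 matrix.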
Vector vs Point
As mentioned before, vectors are uniquely determined by the points at their arrowheads when pointing from the origin. Therefore, sometimes it is more convenient to regard vectors as points, especially when we consider not only one, but a set containing many vectors. You will see that it is easier to visualize a set of vectors as a set of points in [math]\mathbb{R}^2[/math] and [math]\mathbb{R}^3[/math].[br][br]In short, vectors and points in [math]\mathbb{R}^2[/math] and [math]\mathbb{R}^3[/math] can be used interchangeably.[br]
Higher-Dimensional Vectors
We can define vectors in [math] \mathbb{R}^n [/math], where n is any natural number, in a similar way. Any such vector is uniquely determined by the n coordinates of its "arrowhead" when the vector is pointing out from the origin. Those n coordinates can be combined to form a column vector, which is in fact an [b]n x 1 matrix[/b] like this:[br][br][math]w=\begin{pmatrix}x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}[/math][br][br]where [math](x_1,x_2, \ldots, x_n)[/math] are the coordinates of the point at the arrowhead of the vector w pointing from the origin in [math]\mathbb{R}^n[/math].[br][br]
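For instance (an arbitrarily chosen illustration), the vector in [math]\mathbb{R}^4[/math] whose arrowhead is at the point [math](1,-2,0,5)[/math] is the 4 x 1 column vector [math]w=\begin{pmatrix}1 \\ -2 \\ 0 \\ 5 \end{pmatrix}[/math]. Even though we can no longer draw such a vector, the algebra works exactly as in [math]\mathbb{R}^2[/math] and [math]\mathbb{R}^3[/math].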

Linear Transformations

A [b]transformation[/b] is simply a [b]function[/b] (or [b]mapping[/b]) [math]T[/math] from [math]\mathbb{R}^n[/math] to [math]\mathbb{R}^m[/math] i.e. for any (input) vector [math]v[/math] in [math]\mathbb{R}^n[/math], [math]T(v)[/math] is an (output) vector in [math]\mathbb{R}^m[/math]. In linear algebra, we will mainly study a very special class of transformations called [b]linear transformations[/b]. They are transformations that [u]preserve vector addition and scaling[/u] i.e. [math]T:\mathbb{R}^n\to\mathbb{R}^m[/math] is a linear transformation if for any vectors [math]u, v[/math] in [math]\mathbb{R}^n[/math] and any real number [math]k[/math], we have [br][br][math]T\left(u+v\right)=T\left(u\right)+T\left(v\right)[/math], and [math]T\left(kv\right)=kT\left(v\right)[/math][br][br]Therefore, given the standard basis [math]e_1,e_2, \ldots, e_n[/math] for [math]\mathbb{R}^n[/math], any vector [math]v[/math] in [math]\mathbb{R}^n[/math] can be written as a linear combination of the standard basis vectors as follows:[br][br][math]v=c_1e_1+c_2e_2+\cdots+c_ne_n[/math][br][br]When we apply a linear transformation [math]T[/math] to [math]v[/math], using the fact that [math]T[/math] preserves vector addition and scaling, we get[br][br][math]T(v)=c_1T(e_1)+c_2T(e_2)+\cdots+c_nT(e_n)[/math][br][br]You can see that [math]T(v)[/math] is the linear combination of [math]T(e_1),T(e_2),\ldots,T(e_n)[/math] with the same weights. In other words, the linear transformation [math]T[/math] is uniquely determined by [math]T(e_1),T(e_2),\ldots,T(e_n)[/math].[br][br]In the following applet, we consider any linear transformation [math]T:\mathbb{R}^2\to\mathbb{R}^2[/math]. As mentioned above, we can define [math]T[/math] by specifying [math]T(\hat{\mathbf{i}})[/math] and [math]T(\hat{\mathbf{j}})[/math]. You can change [math]T(\hat{\mathbf{i}})[/math] and [math]T(\hat{\mathbf{j}})[/math] freely, then click the "Go" button and see how the grid in the domain is "transformed" under [math]T[/math]. Also, you can define the vector [math]v[/math] by inputting the coordinates of its arrowhead. The column vector [math]T(v)[/math] will be shown. Again, you can click the "Go" button to see the transformation from [math]v[/math] to [math]T(v)[/math] visually.[br][br]
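Here is a quick worked example of how a linear transformation is pinned down by the images of the basis vectors (the values of [math]T(\hat{\mathbf{i}})[/math] and [math]T(\hat{\mathbf{j}})[/math] below are chosen arbitrarily and are not taken from the applet): suppose [math]T:\mathbb{R}^2\to\mathbb{R}^2[/math] is linear with [math]T(\hat{\mathbf{i}})=\begin{pmatrix}1 \\ 2\end{pmatrix}[/math] and [math]T(\hat{\mathbf{j}})=\begin{pmatrix}0 \\ -1\end{pmatrix}[/math]. For [math]v=\begin{pmatrix}3 \\ 4\end{pmatrix}=3\hat{\mathbf{i}}+4\hat{\mathbf{j}}[/math], we get[br][br][math]T(v)=3T(\hat{\mathbf{i}})+4T(\hat{\mathbf{j}})=3\begin{pmatrix}1 \\ 2\end{pmatrix}+4\begin{pmatrix}0 \\ -1\end{pmatrix}=\begin{pmatrix}3 \\ 2\end{pmatrix}[/math][br][br]so knowing [math]T(\hat{\mathbf{i}})[/math] and [math]T(\hat{\mathbf{j}})[/math] is enough to compute [math]T(v)[/math] for every [math]v[/math].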
The following are some questions that test your understanding of linear transformations:
Will a grid always be transformed into a grid by any linear transformation? Explain your answer briefly.
Can a zero vector be transformed to a non-zero vector by a linear transformation? Explain your answer briefly.
Which of the following is a linear transformation from [math]\mathbb{R}^2[/math] to [math]\mathbb{R}^2[/math]? You can select more than one answer.
Let [math]T:\mathbb{R}^2\to\mathbb{R}^2[/math] such that [math]T(\hat{\mathbf{i}})=\begin{pmatrix}3 \\ -1\end{pmatrix}[/math] and [math]T(\hat{\mathbf{j}})=\begin{pmatrix}2 \\ 0\end{pmatrix}[/math]. Find [math]T\left(\begin{pmatrix} -2 \\ 3 \end{pmatrix}\right)[/math].[br](You should first try to compute for the answer without using the above applet.)
Let [math]T:\mathbb{R}^3\to\mathbb{R}^2[/math] such that [math]T(\hat{\mathbf{i}})=\begin{pmatrix}2 \\ -3\end{pmatrix}[/math], [math]T(\hat{\mathbf{j}})=\begin{pmatrix}1 \\ -4\end{pmatrix}[/math] and [math]T(\hat{\mathbf{k}})=\begin{pmatrix}0 \\ 5\end{pmatrix}[/math]. Find [math]T\left(\begin{pmatrix} 3 \\ -1 \\ 2\end{pmatrix}\right)[/math].[br]

Gaussian Elimination

What is Gaussian Elimination?
Consider a system of linear equations:[br][br][math] \left\{\begin{eqnarray} a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n & = & b_1 \\[br]a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n & = & b_2 \\[br]\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots & \cdots & \\[br]a_{m1}x_1+a_{m2}x_2+\cdots+a_{mn}x_n & = & b_m \end{eqnarray} \right.[/math],[br][br]where [math]a_{ij}, \ i=1,\ldots,m, \ j=1,\ldots,n[/math] and [math]b_i, \ i=1,\ldots,m[/math] are real numbers. We want to find an efficient way to solve the system for the solution(s). Let us first consider a very simple example:[br][br][math]\left\{\begin{eqnarray}x_1-2x_2 &=& -1 \\ -2x_1+6x_2&=&6\end{eqnarray}\right.[/math][br][br]In high school, we already learned how to solve this system: We can multiply the first equation by 2 and add it to the second equation to eliminate the variable [math]x_1[/math]. Then we get[br][br][math]\left\{\begin{eqnarray}x_1-2x_2 &=& -1 \\ 2x_2&=&4\end{eqnarray}\right.[/math][br][br]The solution to the system can be easily obtained by first solving the second equation for [math]x_2[/math] and then substituting the value of [math]x_2[/math] into the first equation to find the value of [math]x_1[/math]. Hence the solution is [math]\left(x_1,x_2\right)=\left(3,2\right)[/math].[br][br]This variable-eliminating process can be vastly generalized to deal with any system of linear equations of any size. Such a generalization is called [b]Gaussian elimination[/b]. The action of using one equation to eliminate a variable in another equation is a special case of the so-called [b]elementary row operation[/b]. The system in the last step, which we can easily solve, is said to be in [b]echelon form[/b]. This is the "workhorse algorithm" that all students must learn well when studying linear algebra because almost every computational problem in linear algebra is about solving a linear system of some kind. [br]
Augmented Matrix
For a system of linear equations, all the essential information is contained in the coefficients [math]a_{ij}[/math] and the constants [math]b_i[/math]. Therefore, we can compactly represent the system as a matrix. First of all, we have[br][br][math]A=\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ [br]\vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}[/math][br][br][math]A[/math] is called the [b]coefficient matrix[/b] of the system. The [math]j^{\text{th}}[/math] column contains all the coefficients of the variable [math]x_j[/math] in all the linear equations, where [math]j=1,\ldots,n[/math]. Then we can append the column of constants [math]b_i[/math] to the right of the coefficient matrix and separate it from the coefficients by a vertical line to form the [b]augmented matrix[/b] of the system:[br][br][math]\left(\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ [br]\vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m\end{array}\right)[/math][br][br]([u]Note[/u]: The vertical line may not be drawn in augmented matrices in some textbooks.)[br][br]Gaussian elimination is an algorithm that transforms an augmented matrix into one in echelon form.[br][br]
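As a quick illustration, the small system solved earlier can be recorded in augmented-matrix form as[br][br][math]\left(\begin{array}{cc|c} 1 & -2 & -1 \\ -2 & 6 & 6 \end{array}\right)[/math][br][br]Applying the elementary row operation [math]R_2\to R_2+2R_1[/math] (adding 2 times row 1 to row 2, which is exactly the variable-eliminating step used before) gives[br][br][math]\left(\begin{array}{cc|c} 1 & -2 & -1 \\ 0 & 2 & 4 \end{array}\right)[/math][br][br]This augmented matrix is in echelon form, and back-substitution recovers the solution [math]\left(x_1,x_2\right)=\left(3,2\right)[/math] as before.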
Exercise
Write down the augmented matrix of the following system of linear equations:[br][br][math]\left\{\begin{eqnarray} 3x_2-4x_3 + 7 &=& x_1 - 5 \\ 4-x_4-x_2 &=& 3x_2-6 \\ x_2-2x_3+4x_1 &=& x_5 \\[br]x_2-2x_5 &=& 2-x_4\end{eqnarray}\right.[/math][br][br]

Determinant and Area

Determinant of a 2 x 2 matrix
Given any 2 x 2 matrix [math]A=\begin{pmatrix}a & b\\c & d\end{pmatrix}[/math], we know that the determinant is [math]\det(A)=ad-bc[/math]. We already learned that if [math]\det(A)\ne 0[/math], then [math]A[/math] is invertible. Here we study the determinant from a geometric viewpoint.[br][br]We consider the linear transformation [math]T:\mathbb{R}^2\to\mathbb{R}^2[/math] such that the matrix for [math]T[/math] is [math]A[/math] i.e. [math]T(x)=Ax[/math] for any vector [math]x[/math] in [math]\mathbb{R}^2[/math]. In the applet below, you can see how the quadrilateral CDEF is transformed by [math]T[/math]. Compare the area of the quadrilateral before and after the transformation and find out the meaning of [math]\det(A)[/math].[br][br]
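For a concrete instance of the formula (with arbitrarily chosen entries): if [math]A=\begin{pmatrix}3 & 1\\2 & 4\end{pmatrix}[/math], then [math]\det(A)=3\cdot4-1\cdot2=10\ne 0[/math], so [math]A[/math] is invertible; if [math]B=\begin{pmatrix}2 & 4\\1 & 2\end{pmatrix}[/math], then [math]\det(B)=2\cdot2-4\cdot1=0[/math], so [math]B[/math] is not invertible.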
Suppose the quadrilateral CDEF is a unit square and [math]T[/math] is any linear transformation defined by you. What is the relationship between [math]\det(A)[/math] and the area of the transformed quadrilateral?
What can you say about the linear transformation [math]T[/math] when [math]\det(A)=0[/math]? Can you give a reason from a geometric viewpoint why [math]A[/math] is not invertible when [math]\det(A)=0[/math]?
Given two 2 x 2 matrices [math]A[/math] and [math]B[/math], it can be shown that [math]\det(AB)=\det(A)\det(B)[/math]. Can you explain why this is true from a geometric viewpoint?

Vector Spaces

Vector Spaces
We generalize the theory of vectors in [math]\mathbb{R}^n[/math] by extracting all the essential features of vector addition and scaling that make the theory work. The following is the most fundamental definition in linear algebra, which embodies all the important characteristics of vectors.[br][br][u]Definition[/u]: A [b]vector space over[/b] [math]\mathbb{R}[/math] is a nonempty set [math]V[/math] of objects, called [b]vectors[/b], on which are defined two operations, called [b]addition[/b] and [b]multiplication by a scalar (real number)[/b], that satisfy the following [b]axioms[/b]: For any [math]u,v,[/math] and [math]w[/math] in [math]V[/math] and real numbers [math]c[/math] and [math]d[/math]:[br][list=1][*][math]u+v[/math] and [math]cu[/math] are in [math]V[/math] (closure under addition and scalar multiplication)[/*][*][math]u+v=v+u[/math] (commutativity of addition)[/*][*][math]\left(u+v\right)+w=u+\left(v+w\right)[/math] (associativity of addition)[/*][*]There exists a [b]zero vector[/b] [math]0[/math] in [math]V[/math] such that [math]u+0=u[/math] (additive identity)[/*][*]For each [math]u[/math] in [math]V[/math], there exists a vector [math]-u[/math] in [math]V[/math] such that [math]u+\left(-u\right)=0[/math] (additive inverse)[/*][*][math]c\left(u+v\right)=cu+cv[/math] (distributivity for vector addition)[/*][*][math]\left(c+d\right)u=cu+du[/math] (distributivity for scalar addition)[/*][*][math]c\left(du\right)=\left(cd\right)u[/math] (compatibility of scalar multiplication with real number multiplication)[br][/*][*][math]1u=u[/math]  (identity of scalar multiplication)[/*][/list][br][u]Remark[/u]: More generally, we can consider scalars other than [math]\mathbb{R}[/math] in the above definition e.g. a vector space over [math]\mathbb{C}[/math].[br][br]We can deduce the following simple facts directly from the above axioms:[br][list][*]The zero vector [math]0[/math] is unique.[/*][*]For any vector [math]u[/math], [math]-u[/math] is unique. Moreover, [math]-u=\left(-1\right)u[/math].[/*][*][math]0u=0[/math] for any vector [math]u[/math]. (Note: the left zero is a real number and the right zero is the zero vector.)[/*][*][math]c0=0[/math] for any real number [math]c[/math]. [br][/*][/list]
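To see how such facts follow from the axioms alone, here is a short derivation of the third one (using only axioms 3, 4, 5 and 7):[br][br][math]0u=(0+0)u=0u+0u[/math][br][br]Adding [math]-(0u)[/math] to both sides and using associativity, we get [math]0=0u+(-(0u))=(0u+0u)+(-(0u))=0u+(0u+(-(0u)))=0u+0=0u[/math].[br][br]Hence [math]0u=0[/math], the zero vector.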
Examples of Vector Spaces
The following are some examples of vector spaces over [math]\mathbb{R}[/math]:[br][br][br][u]Example 1[/u]: The first obvious example is [math]\mathbb{R}^n[/math].[br][br][br][u]Example 2[/u]: Polynomials[br][br]For a non-negative integer [math]n[/math], let [math]\mathbb{P}_n[/math] be the set of polynomials in [math]t[/math] of degree at most [math]n[/math] with real coefficients. Let [math]p(t)=a_0+a_1t+\cdots+a_nt^n[/math] and [math]q(t)=b_0+b_1t+\cdots+b_nt^n[/math] be polynomials in [math]\mathbb{P}_n[/math]. Then we define the following:[br][br][b]Addition[/b]: [math]p(t)+q(t)=(a_0+b_0)+(a_1+b_1)t+\cdots+(a_n+b_n)t^n[/math][br][br][b]Scalar multiplication[/b]: [math]cp(t)=ca_0+ca_1t+\cdots+ca_nt^n[/math] for any real number [math]c[/math][br][br]We consider such polynomials as "vectors". It can easily be shown that they satisfy all the axioms in the definition. The zero polynomial acts as the zero vector. Therefore, [math]\mathbb{P}_n[/math] is a vector space.[br][br][br][u]Example 3[/u]: Sequences of real numbers [br][br]Let [math]\mathbb{S}[/math] be the set of all real number sequences. We denote a real number sequence [math]a_1, a_2, a_3, \ldots [/math] by [math](a_n)[/math]. For any sequences [math](a_n)[/math] and [math](b_n)[/math] in [math]\mathbb{S}[/math], we define the following:[br][br][b]Addition[/b]: [math](a_n)+(b_n)=(a_n+b_n)[/math][br][br][b]Scalar multiplication[/b]: [math]c(a_n)=(ca_n)[/math] for any real number [math]c[/math][br][br]Again, it is easy to verify that [math]\mathbb{S}[/math] is a vector space.[br][br][br][u]Example 4[/u]: Matrices[br][br]Let [math]M_{m\times n}[/math] be the set of all m x n matrices. We have already defined the addition and scalar multiplication of m x n matrices before. It is clear that [math]M_{m\times n}[/math] is a vector space. Hence, the set of all linear transformations from [math]\mathbb{R}^n[/math] to [math]\mathbb{R}^m[/math] is also a vector space.[br][br][br][u]Example 5[/u]: Real-valued functions[br][br]Let [math]V[/math] be the set of all real-valued functions defined on a set [math]D[/math]. Let [math]f:D\to \mathbb{R}[/math] and [math]g:D\to \mathbb{R}[/math] be two such functions in [math]V[/math]. Then we define the following: [br][br][b]Addition[/b]: [math](f+g)(x)=f(x)+g(x)[/math] for any [math]x[/math] in [math]D[/math][br][br][b]Scalar multiplication[/b]: Let [math]c[/math] be any real number. [math](cf)(x)=cf(x)[/math] for any [math]x[/math] in [math]D[/math][br][br]Real-valued functions can be regarded as "vectors" and it can be shown that they satisfy all the axioms in the definition of vector spaces. Therefore, [math]V[/math] is a vector space.[br][br][br][br]
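As a small numerical check of the operations in Example 2 (with polynomials chosen arbitrarily): take [math]p(t)=1+2t[/math] and [math]q(t)=3-t+t^2[/math] in [math]\mathbb{P}_2[/math]. Then [math]p(t)+q(t)=4+t+t^2[/math] and [math]2p(t)=2+4t[/math]. Both results are again polynomials of degree at most 2, so [math]\mathbb{P}_2[/math] is closed under addition and scalar multiplication (axiom 1).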
Exercise
Check the box if the set is a vector space. [br][br](Note: You can check multiple boxes.)

Eigenvalues and Eigenvectors

What are Eigenvalues and Eigenvectors?
In this chapter, we mainly focus on square matrices.[br][br]Let [math]A[/math] be an n x n matrix. We want to find a special nonzero vector in [math]\mathbb{R}^n[/math] such that its transformation by [math]A[/math] is a scaling of this vector. This nonzero vector is called an [b]eigenvector[/b] and the scaling factor is called an [b]eigenvalue[/b] (note: it can be zero). More rigorously, we have the following definition:[br][br][u]Definition[/u]: An [b]eigenvector[/b] of an n x n matrix [math]A[/math] is a [u]nonzero[/u] vector [math]x[/math] such that [math]Ax=\lambda x[/math] for some real number [math]\lambda[/math], which is called an [b]eigenvalue[/b] of [math]A[/math]. [math]x[/math] is said to be an eigenvector corresponding to [math]\lambda[/math].[br][br]From the geometric point of view, it means that the line containing the vector [math]x[/math], i.e. [math]\text{Span}\{x\}[/math], is mapped into itself under the linear transformation corresponding to [math]A[/math], and any vector in [math]\text{Span}\{x\}[/math] is scaled by the factor [math]\lambda[/math] under the transformation. [br][br]In the applet below, you first define the 2 x 2 matrix [math]A[/math] by setting [math]T(\hat{\mathbf{i}})[/math] and [math]T(\hat{\mathbf{j}})[/math]. Then, to find an eigenvector, you need to move the vector [math]v[/math] such that it lies on the orange dotted line containing [math]Av[/math].[br][br]Complete the following tasks:[br][list=1][*]Find a matrix [math]A[/math] that has two eigenvalues.[/*][*]Find a matrix [math]A[/math] that has only one eigenvalue and all the eigenvectors lie on a line.[/*][*]Find a matrix [math]A[/math] that has only one eigenvalue and all nonzero vectors are eigenvectors.[/*][*]Find a matrix [math]A[/math] that has no eigenvalue. [br][/*][/list][br][br]
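As a small worked instance of the definition (the matrix here is chosen arbitrarily and is not meant as an answer to the tasks): let [math]A=\begin{pmatrix}2 & 1\\1 & 2\end{pmatrix}[/math] and [math]x=\begin{pmatrix}1\\1\end{pmatrix}[/math]. Then[br][br][math]Ax=\begin{pmatrix}2 & 1\\1 & 2\end{pmatrix}\begin{pmatrix}1\\1\end{pmatrix}=\begin{pmatrix}3\\3\end{pmatrix}=3x[/math][br][br]so [math]x[/math] is an eigenvector of [math]A[/math] corresponding to the eigenvalue [math]\lambda=3[/math]: every vector on the line [math]\text{Span}\{x\}[/math] is simply stretched by a factor of 3.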
Write down the matrices that you have obtained in the above tasks.

Inner Product

Projection Map
In [math]\mathbb{R}^2[/math], we consider a unit vector [math]u[/math]. We define the [b]projection map[/b] onto the line containing the vector [math]u[/math] as the map [math]P_u:\mathbb{R}^2\to \mathbb{R}[/math] such that for any vector [math]v[/math] in [math]\mathbb{R}^2[/math], [math]P_u(v)[/math] is the [u]signed distance[/u] from the origin to the foot of the perpendicular dropped from the arrowhead of [math]v[/math] to the line containing [math]u[/math]. The sign is positive/negative if the vector from the origin to the foot of the perpendicular points in the same/opposite direction as [math]u[/math], as shown in the applet below.[br][br]As illustrated by the applet below, [math]P_u:\mathbb{R}^2\to \mathbb{R}[/math] is in fact a linear transformation. Therefore, it can be represented by a 1 x 2 matrix. Moreover, it can be shown that [math]P_u(\hat{\mathbf{i}})=u_x[/math] and [math]P_u(\hat{\mathbf{j}})=u_y[/math], where [math]u=\begin{pmatrix}u_x\\u_y\end{pmatrix}[/math]. Hence, for any [math]v=\begin{pmatrix}v_x\\v_y\end{pmatrix}[/math], we have[br][br][math]P_u(v)=\left(u_x \ u_y\right)\begin{pmatrix}v_x\\v_y\end{pmatrix}=u_xv_x+u_yv_y[/math]
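For a concrete check (numbers chosen arbitrarily): take the unit vector [math]u=\begin{pmatrix}3/5\\4/5\end{pmatrix}[/math] (its length is [math]\sqrt{(3/5)^2+(4/5)^2}=1[/math]) and [math]v=\begin{pmatrix}2\\1\end{pmatrix}[/math]. Then[br][br][math]P_u(v)=\left(3/5 \ \ 4/5\right)\begin{pmatrix}2\\1\end{pmatrix}=\frac{3}{5}\cdot 2+\frac{4}{5}\cdot 1=2[/math][br][br]so the foot of the perpendicular from the arrowhead of [math]v[/math] lies at a signed distance 2 from the origin along the direction of [math]u[/math].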
Inner Product
The definition of the projection map can readily be generalized to [math]\mathbb{R}^3[/math] or even [math]\mathbb{R}^n[/math] i.e. suppose [math]u=\begin{pmatrix}u_1\\u_2\\ \vdots \\ u_n\end{pmatrix}[/math] is a unit vector in [math]\mathbb{R}^n[/math]. Then for any vector [math]v=\begin{pmatrix}v_1\\v_2\\ \vdots \\ v_n\end{pmatrix}[/math] in [math]\mathbb{R}^n[/math], we define the projection map onto the line containing [math]u[/math] as follows:[br][br][math]P_u(v)=u^Tv=\left(u_1 \ u_2 \ \cdots \ u_n\right)\begin{pmatrix}v_1\\v_2\\ \vdots \\ v_n\end{pmatrix}=u_1v_1+u_2v_2+\cdots+u_nv_n[/math][br][br]The above expression is symmetric in [math]u[/math] and [math]v[/math], which suggests that we should extend the definition to any vector [math]u[/math], without the restriction that [math]u[/math] is a unit vector:[br][br][u]Definition[/u]: Given any vectors [math]u,v[/math] in [math]\mathbb{R}^n[/math], the real number [math]u^Tv=\left(u_1 \ u_2 \ \cdots \ u_n\right)\begin{pmatrix}v_1\\v_2\\ \vdots \\ v_n\end{pmatrix}=u_1v_1+u_2v_2+\cdots+u_nv_n[/math] is called the [b]inner product[/b] (or [b]dot product[/b]) of [math]u[/math] and [math]v[/math], usually denoted by [math]u\cdot v[/math].[br][br][u]Remarks[/u]:[br][list][*]If [math]u\ne0[/math], then we can write [math]u=s\hat{u}[/math], where [math]\hat{u}[/math] is the unit vector in the direction of [math]u[/math] and [math]s[/math] is the length of the vector [math]u[/math]. Hence [math]u\cdot v=(s\hat{u})^Tv=s(\hat{u}^Tv)=sP_{\hat{u}}v[/math]. Moreover, for any nonzero vector [math]v[/math], [math]v[/math] is perpendicular to [math]u[/math] if and only if [math]P_{\hat{u}}v=0[/math], or equivalently, [math]u\cdot v=0[/math]. [br][/*][*]The geometric concept of the "angle between two vectors" is encoded in the inner product: Let [math]\hat{u}[/math] and [math]\hat{v}[/math] be two unit vectors in [math]\mathbb{R}^2[/math] or [math]\mathbb{R}^3[/math], then [math]\hat{u}\cdot \hat{v}=P_{\hat{u}}\hat{v}=\cos(\theta)[/math], where [math]\theta[/math] is the angle between the two vectors. Therefore, in [math]\mathbb{R}^n[/math], we can define the "angle between two vectors" through the inner product.[br][/*][/list][br]The following are some basic properties of the inner product: Let [math]u,v[/math] and [math]w[/math] be vectors in [math]\mathbb{R}^n[/math], and let [math]c[/math] be any real number. Then[br][list=1][*][math]u\cdot v=v\cdot u[/math][br][/*][*][math](u+v)\cdot w=u\cdot w+v\cdot w[/math][br][/*][*][math](cu)\cdot v=c(u\cdot v)=u\cdot(cv)[/math][br][/*][*][math]u\cdot u\geq 0[/math] and [math]u\cdot u=0[/math] if and only if [math]u=0[/math][/*][/list][br]Notice that for any vector [math]u[/math] in [math]\mathbb{R}^2[/math] (or [math]\mathbb{R}^3[/math]), [math]u\cdot u=u_x^2+u_y^2[/math] (or [math]u\cdot u=u_x^2+u_y^2+u_z^2[/math]), which is the square of the length of the vector. Hence, we can generalize this definition to vectors in [math]\mathbb{R}^n[/math] as follows:[br][br][u]Definition[/u]: The [b]length[/b] (or [b]norm[/b]) of a vector [math]v[/math] in [math]\mathbb{R}^n[/math] is the nonnegative real number [math]\|v\|[/math] defined by[br][br][math]\|v\|=\sqrt{v\cdot v}=\sqrt{v_1^2+v_2^2+\cdots+v_n^2}[/math][br][br]Given any nonzero vector [math]v[/math], we can compute the unit vector [math]\hat{v}[/math] in the direction of [math]v[/math] as follows:[br][br][math]\hat{v}=\frac1{\|v\|}v[/math][br][br]
For any two vectors [math]u[/math] and [math]v[/math] in [math]\mathbb{R}^n[/math], we can measure the [b]distance[/b] between their arrowheads by finding the length of the vector from one arrowhead to the other, i.e. the distance is [math]\|u-v\|[/math].[br][br][br]
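Here is a small worked example pulling these definitions together (vectors chosen arbitrarily): let [math]u=\begin{pmatrix}1\\2\\2\end{pmatrix}[/math] and [math]v=\begin{pmatrix}3\\0\\4\end{pmatrix}[/math] in [math]\mathbb{R}^3[/math]. Then [math]u\cdot v=1\cdot3+2\cdot0+2\cdot4=11[/math], [math]\|u\|=\sqrt{1+4+4}=3[/math], so [math]\hat{u}=\frac13\begin{pmatrix}1\\2\\2\end{pmatrix}[/math], and the distance between the arrowheads is [math]\|u-v\|=\left\|\begin{pmatrix}-2\\2\\-2\end{pmatrix}\right\|=\sqrt{4+4+4}=2\sqrt3[/math].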
Exercise
Let [math]u=\begin{pmatrix}2\\3\\-1\end{pmatrix}[/math] in [math]\mathbb{R}^3[/math]. [br][list=1][*]Find the unit vector [math]\hat{u}[/math].[/*][br][*]Let [math]L[/math] be the line through the origin containing [math]u[/math]. Using the inner product, find the perpendicular distance from the point [math](-2,0,7)[/math] to the line [math]L[/math].[br][/*][/list][br]
Suppose [math]u,v[/math] are two vectors in [math]\mathbb{R}^n[/math].[br][list=1][*]Prove that [math]\|u-v\|^2=\|u\|^2+\|v\|^2-2u\cdot v[/math]. (Hint: write the norms in terms of inner products)[/*][br][*]Using (1), prove that [math]u\cdot v=\frac14\left(\|u+v\|^2-\|u-v\|^2\right)[/math]. (Note: This is called the [b]polarization identity[/b]. Using it, we can define the inner product in terms of the norm.)[/*][br][*]Using (1), prove the law of cosines for triangles in the plane.[/*] [br][/list]

Symmetric Matrices

A square matrix [math]A[/math] is called a [b]symmetric matrix[/b] if [math]A^T=A[/math].[br][br]Symmetric matrices have many nice properties, which can be summarized in the following theorem:[br][br][u]Spectral Theorem for symmetric matrices[/u]: For any [math]n\times n[/math] symmetric matrix [math]A[/math], we have[br][list=1][*][math]A[/math] has [math]n[/math] real eigenvalues (counting multiplicities).[/*][br][*] Any two eigenvectors corresponding to two distinct eigenvalues are orthogonal.[/*][br][*] The dimension of each eigenspace equals the multiplicity of the corresponding eigenvalue.[/*][br][*] [math]A[/math] is diagonalizable. Moreover, we can choose an orthogonal matrix [math]P[/math] such that [math]A=PDP^{-1}=PDP^T[/math], where [math]D[/math] is a diagonal matrix.[/*][br][/list]The proof of the spectral theorem is beyond the scope of this course; here we only prove (2):[br]Suppose [math]u[/math] and [math]v[/math] are eigenvectors of [math]A[/math] corresponding to eigenvalues [math]\lambda[/math] and [math]\mu[/math] respectively, where [math]\lambda\not=\mu[/math]. Then, using [math]A^T=A[/math], we have the following:[br][br][math]\lambda (u\cdot v)=\lambda u\cdot v =(Au)\cdot v=(Au)^Tv=u^TA^Tv=u^T(Av)=u^T(\mu v)=\mu u\cdot v=\mu(u\cdot v)[/math][br][math]\Rightarrow (\lambda-\mu)(u\cdot v)=0[/math][br]Since [math]\lambda\not=\mu[/math], [math]u\cdot v=0[/math], which implies that [math]u[/math] and [math]v[/math] are orthogonal.[br][br]
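As a small illustration of (2) and (4) (the matrix is chosen arbitrarily): let [math]A=\begin{pmatrix}1 & 2\\2 & 1\end{pmatrix}[/math]. Its eigenvalues are [math]\lambda=3[/math] and [math]\mu=-1[/math], with corresponding eigenvectors [math]\begin{pmatrix}1\\1\end{pmatrix}[/math] and [math]\begin{pmatrix}1\\-1\end{pmatrix}[/math], which are indeed orthogonal. Normalizing them and using them as the columns of [math]P[/math], we get[br][br][math]P=\frac1{\sqrt2}\begin{pmatrix}1 & 1\\1 & -1\end{pmatrix}, \quad D=\begin{pmatrix}3 & 0\\0 & -1\end{pmatrix}, \quad A=PDP^T[/math][br][br]You can check directly that [math]P^TP=I[/math] (so [math]P[/math] is orthogonal) and that [math]PDP^T[/math] multiplies out to [math]A[/math].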
In the applet below, you can visualize the diagonalization of a [math]2\times 2[/math] symmetric matrix.
