Yet Another Basic Question on Linear Transformations and Their Matrices

In summary: the thread discusses the basics of linear transformations and their matrix representations. The replies explain that vectors are elements of a vector space, that coordinates alone do not tell us which point an element refers to, and that two more pieces of information are needed to say which "point" a coordinate tuple refers to: "relative to what (origin) point", and "using what coordinate system".
  • #1
Math Amateur
I am revising the basics of linear transformations, trying to get a thorough understanding of them and their matrices ...

At present I am working through examples and exercises in Seymour Lipschutz's book: Linear Algebra, Fourth Edition (Schaum's Outline Series) ...

At present I am focused on Chapter 6: Linear Mappings and Matrices ...

I need help with an aspect of Example 6.1 on page 196 ...

Example 6.1 reads as follows:

[Example 6.1 is given in an attached image (attachment 5279), not reproduced here; it defines a linear map $F$ on $\Bbb R^2$ and a basis $S = \{u_1, u_2\}$ with $u_1 = (1, 2)$.]

Now in Example 6.1 (a), (1) above, Lipschutz determines \(\displaystyle F(u_1)\) as follows:

\(\displaystyle F(u_1) = F \left( \begin{bmatrix} 1 \\ 2 \end{bmatrix} \right) = \begin{bmatrix} 8 \\ -6 \end{bmatrix}\)

and then Lipschutz goes on to find the coordinates \(\displaystyle x\) and \(\displaystyle y\) of \(\displaystyle \begin{bmatrix} 8 \\ -6 \end{bmatrix}\) relative to the basis \(\displaystyle \{ u_1, u_2 \}\) ...

... BUT ... what is \(\displaystyle \begin{bmatrix} 8 \\ -6 \end{bmatrix}\) exactly?

To answer my own question ... I suspect it is the coordinates of a point relative to the standard basis \(\displaystyle e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\) ... is that right? Have I described it correctly?

So, if I am right ...

\(\displaystyle \begin{bmatrix} 8 \\ -6 \end{bmatrix} = 8e_1 + (-6)e_2\)

Can someone please confirm that the above analysis of what is going on is correct ... or alternatively point out errors and shortcomings in what I have said ...
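(As a quick numerical sanity check, here is a minimal sketch of the computation. The map $F(x, y) = (2x + 3y, 4x - 5y)$ and the second basis vector $u_2 = (2, 5)$ are taken to be those of Lipschutz's Example 6.1; they are consistent with $F(u_1) = (8, -6)$ above, but they live in the attached image rather than the quoted text, so treat them as assumptions.)

```python
import numpy as np

# Assumed from Lipschutz's Example 6.1 (the attachment is not reproduced here):
# F(x, y) = (2x + 3y, 4x - 5y), with basis S = {u1, u2} = {(1, 2), (2, 5)}.
def F(v):
    x, y = v
    return np.array([2 * x + 3 * y, 4 * x - 5 * y])

u1 = np.array([1, 2])
u2 = np.array([2, 5])

# Columns of P are the basis vectors; solving P @ c = F(u1) gives the
# coordinates of F(u1) relative to S.
P = np.column_stack([u1, u2])
c = np.linalg.solve(P, F(u1))

print(F(u1))  # [ 8 -6]   -- coordinates relative to the standard basis {e1, e2}
print(c)      # [ 52. -22.] -- coordinates relative to S = {u1, u2}
```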

Peter
 
  • #2
Hi Peter,

Lipschutz writes, "For notational convenience, we use column vectors." This means that he'll use the column-vector notation $\begin{bmatrix}x\\y\end{bmatrix}$ to mean the point $(x,y)$ in $\Bbb R^2$.
 
  • #3
Euge said:
Hi Peter,

Lipschutz writes, "For notational convenience, we use column vectors." This means that he'll use the column-vector notation $\begin{bmatrix}x\\y\end{bmatrix}$ to mean the point $(x,y)$ in $\Bbb R^2$.
Thanks Euge ...

I think what you are saying is that \(\displaystyle \begin{bmatrix} 8 \\ -6 \end{bmatrix}\) is a vector (a point in \(\displaystyle \mathbb{R}^2\)) expressed relative to the standard basis \(\displaystyle e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\) ...

Maybe I am blurring the distinction between points and vectors a bit ...

Peter
 
  • #4
Vectors are elements of a vector space. While that seems almost tautological, it's the most accurate description.

Some people think of vectors as "arrows" (so they START at a point, and "go somewhere for a distance"). This isn't quite accurate, for we can imagine two such arrows that start at *different* points, and then we have no way to ADD them.

However, we CAN add all such arrows that start at a GIVEN point, by a purely geometric process called the "parallelogram rule". This process of "choosing a point" turns our geometric space (more properly, an AFFINE space) into a vector space, if we agree to identify the chosen point as our "origin".

However, we may already have an origin: for example, we may already have a coordinate space used to describe a curve or a surface, and we wish to describe our chosen point in terms of that same coordinate system. So now we have TWO coordinate systems: one relative to our original origin, and one relative to our chosen point. This is, in somewhat loose terms, the difference between $\Bbb R^n$ and $(\Bbb R^n)_p$.

So if in our original coordinate system, we have $p = (c_1,c_2,c_3)$, then in our SECOND coordinate system (relative to $p$) we have: $p = (0,0,0)_p$.

I bring this up to emphasize that coordinates ALONE do not tell us "which point" (element of a vector space) we have. We need two more pieces of information: "relative to what (origin) point", and "using what coordinate system".
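As a small illustration of "relative to what (origin) point" (the numbers below are illustrative, not from the thread):

```python
import numpy as np

p = np.array([1.0, 2.0, 3.0])   # our chosen point, in the original coordinates
q = np.array([4.0, 6.0, 8.0])   # some other point, in the original coordinates

# Coordinates of q relative to p (i.e., in (R^3)_p): subtract the new origin.
q_rel_p = q - p
p_rel_p = p - p

print(q_rel_p)  # [3. 4. 5.]
print(p_rel_p)  # [0. 0. 0.]  -- p is (0, 0, 0)_p, as described above
```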

In their rush to give students some basic facility with vectors and matrices, many textbooks ignore these niceties lurking in the background. In other words, they ASSUME that the basis $\{(1,0,\dots,0),(0,1,\dots,0),\dots,(0,0,\dots,1)\} = \{e_i\}$ will be used relative to the "usual origin" (the 0-vector with all 0 coordinates), so that the $n$-tuple (of real numbers, for example) $(c_1,c_2,\dots,c_n)$ will MEAN $c_1e_1 + c_2e_2 + \cdots + c_ne_n$.

Now, there is a NATURAL isomorphism between $\Bbb R^2$, say, and the space $\text{Mat}_{2 \times 1}(\Bbb R)$, given by:

$(x,y) \mapsto \begin{bmatrix}x\\y\end{bmatrix}$.

It is this isomorphism Lipschutz is tacitly referring to when he says "for notational convenience, we use column vectors". Mathematically speaking, it would be fine to have $\{e_1,e_2\}$ be ANY two orthonormal vectors; we would get the same set of $2 \times 1$ column vector representations, but unless we actually knew what coordinates $e_1, e_2$ had in our "base coordinate system", we would not "know" which "point" in $\Bbb R^2$ any given matrix REFERRED to.
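A minimal sketch of that last point: the same column of coordinates names different points under different (even orthonormal) bases. The two bases below are illustrative choices, not from Lipschutz.

```python
import numpy as np

coords = np.array([1.0, 2.0])  # the "same" column vector of coordinates

# Two different orthonormal bases for R^2; the second is the first
# rotated by 45 degrees.
B1 = np.column_stack([[1.0, 0.0], [0.0, 1.0]])
s = 1 / np.sqrt(2)
B2 = np.column_stack([[s, s], [-s, s]])

# The point each basis assigns to the coordinates (1, 2):
print(B1 @ coords)  # [1. 2.]
print(B2 @ coords)  # [-0.70710678  2.12132034]  -- a different point entirely
```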

There is yet a further wrinkle in all this: orthogonality and normality depend on a given inner product and a given norm. Many times, authors of linear algebra texts tacitly assume there is a natural reason to use the Euclidean inner product:

$\langle (x_1,y_1),(x_2,y_2)\rangle = x_1x_2 + y_1y_2$

but many, many inner products are possible.

Given an inner product, one can DEFINE a norm by: $\|(x,y)\| = \sqrt{\langle(x,y),(x,y)\rangle}$, which leads to the familiar formula:

$\|(x,y)\| = \sqrt{x^2 + y^2}$ when the Euclidean inner product is used. This, of course, is the usual "distance" formula that has its origins in Pythagoras' Theorem.
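A short sketch of defining a norm from an inner product; the weighted inner product at the end is an illustrative non-Euclidean choice (the weights are made up for the example):

```python
import numpy as np

def euclidean_ip(u, v):
    # The Euclidean inner product <(x1, y1), (x2, y2)> = x1*x2 + y1*y2
    return float(np.dot(u, v))

def norm_from_ip(v, ip):
    # A norm DEFINED from an inner product: ||v|| = sqrt(<v, v>)
    return np.sqrt(ip(v, v))

v = np.array([3.0, 4.0])
print(norm_from_ip(v, euclidean_ip))  # 5.0 -- the usual distance formula

# A different (weighted) inner product is equally valid and gives a
# different norm on the same space.
def weighted_ip(u, v):
    return 2.0 * u[0] * v[0] + 1.0 * u[1] * v[1]

print(norm_from_ip(v, weighted_ip))   # sqrt(2*9 + 16) = sqrt(34) ~ 5.83
```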

HOWEVER, one can define "distance" WITHOUT having first defined an inner product; for example, there is the "discrete distance function":

$d((x_1,y_1),(x_2,y_2)) = 1$ if $(x_1,y_1) \neq (x_2,y_2)$
$d((x_1,y_1),(x_1,y_1)) = 0$

that returns a distance of 1, if two points are different, and a distance of 0 if they are the same.
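In code, the discrete distance function described above is simply:

```python
def discrete_distance(p, q):
    # 1 if the points differ, 0 if they are the same.
    return 0 if p == q else 1

print(discrete_distance((1, 2), (3, 4)))  # 1
print(discrete_distance((1, 2), (1, 2)))  # 0
```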

The point is, in arenas of greater mathematical sophistication, "points" don't always have some of the "nice" properties we take for granted in a Euclidean plane.

There is a slight danger in saying the column vector $\begin{bmatrix}x\\y\end{bmatrix}$ IS the point $(x,y)$. What is more ACCURATE to say is that that column matrix is a REPRESENTATION of the point $(x,y)$. When, in mathematics, you see the word "representation", you should immediately think: "Oh, so there's a homomorphism of some kind involved".

Loosely speaking, however, in the same sense that Lipschutz is, you are correct. You (and he) are both blurring these fine distinctions (this is all very well and good, until one has multiple coordinate systems one is switching between, and then it pays to keep them straight in your mind).
 
  • #5
Deveno said:
(post #4 quoted in full above)
Thanks Deveno ... just working through your post now ...

Peter
 
  • #6
Deveno said:
(post #4 quoted in full above)
Well! ... indeed ... that was so helpful ...

Thanks!

Peter
 

Related to Yet Another Basic Question on Linear Transformations and Their Matrices

1. What is a linear transformation?

A linear transformation is a function between vector spaces that preserves the operations of vector addition and scalar multiplication: $T(u + v) = T(u) + T(v)$ and $T(cu) = cT(u)$ for all vectors $u, v$ and scalars $c$.

2. What is a matrix representation of a linear transformation?

A matrix representation is a way of expressing a linear transformation as a matrix, relative to chosen bases. Each column of the matrix is the coordinate vector of the image of a basis vector of the input space, and multiplying the matrix by the coordinate vector of a vector $v$ yields the coordinate vector of $T(v)$.

3. How do you determine the matrix representation of a linear transformation?

The matrix representation of a linear transformation can be determined by finding the images of the basis vectors of the input space, expressing each image in coordinates relative to the output basis, and arranging these coordinate vectors as the columns of a matrix. The order of the basis vectors in the input space and of the corresponding columns in the matrix must agree.
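A minimal sketch of this recipe, reusing the illustrative map $F$ and basis $S = \{(1,2), (2,5)\}$ assumed earlier in this thread (with the same basis on input and output):

```python
import numpy as np

def F(v):
    # An illustrative linear map on R^2 (assumed earlier in the thread).
    x, y = v
    return np.array([2 * x + 3 * y, 4 * x - 5 * y])

basis = [np.array([1, 2]), np.array([2, 5])]   # S = {u1, u2}
P = np.column_stack(basis)

# Column j of the representation is the coordinate vector of F(u_j)
# relative to S, found by solving P @ c = F(u_j).
A = np.column_stack([np.linalg.solve(P, F(u)) for u in basis])
print(A)  # [[ 52. 129.]
          #  [-22. -55.]]
```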

4. Can a linear transformation have multiple matrix representations?

Yes, a linear transformation has a different matrix representation for each choice of basis vectors for the input and output vector spaces. All of these matrices represent the same transformation; any two of them are related by change-of-basis transformations (a similarity transformation, when the same basis is used on input and output).
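A sketch of that relationship, using the same illustrative map as above: the representation relative to $S$ is recovered from the standard matrix by a change of basis.

```python
import numpy as np

# Same illustrative map F; its standard matrix, and the change-of-basis
# matrix P whose columns are the basis S = {(1, 2), (2, 5)}.
A_std = np.array([[2, 3], [4, -5]])
P = np.column_stack([[1, 2], [2, 5]])

# Representation relative to S, via the similarity formula P^{-1} A P.
A_S = np.linalg.inv(P) @ A_std @ P
print(A_S)  # [[ 52. 129.]
            #  [-22. -55.]]  -- matches the column-by-column construction
```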

5. What is the significance of the standard matrix of a linear transformation?

The standard matrix of a linear transformation is the matrix representation of the transformation with respect to the standard bases of the input and output vector spaces. It is significant because it allows easy computation of the image of any vector under the transformation, and because composing linear transformations corresponds to multiplying their standard matrices.
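A sketch of the standard matrix and of composition-as-matrix-multiplication, again with the illustrative map $F$ from above:

```python
import numpy as np

def F(v):
    x, y = v
    return np.array([2 * x + 3 * y, 4 * x - 5 * y])

# The standard matrix: images of e1 and e2 as columns.
e1, e2 = np.array([1, 0]), np.array([0, 1])
A = np.column_stack([F(e1), F(e2)])
print(A)                      # [[ 2  3]
                              #  [ 4 -5]]
print(A @ np.array([1, 2]))   # [ 8 -6] == F(1, 2), by matrix multiplication

# Composition corresponds to matrix multiplication:
# the standard matrix of F o F is A @ A.
print(A @ A)                  # equals the standard matrix of v -> F(F(v))
```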
