Vectors as Instructions

For many of us, it’s hard to imagine how the standard blackboard construction of lines and points in the plane can hold any mystery or subtlety. What could be simpler than drawing a line in two dimensions and decorating it with some points, each bearing a label? However, many of our intuitive notions aren’t sufficiently well-defined to survive the transition from casual use to rigorous proof. This is especially true of anything associated with getting a computer to perform a set of functions for us; the first step in building good software is often the sharpening of fuzzy ideas. The human capacity to ‘figure it out’ has yet to be designed into a set of instructions.

The issue that arises in constructing a line with points in the plane is in understanding the distinction between the ordered pair of numbers used to denote a given point and the ordered pair of numbers used to denote a vector. Take the line $\ell$ in the plane.

[Figure: line and plane]
The points ${\mathcal P}$ and ${\mathcal Q}$ fall on the line and have coordinates given by the ordered pairs


\[ {\mathcal P} = \left( \begin{array}{c} x_1 \\ y_1 \end{array} \right) \]

and

\[ {\mathcal Q} = \left( \begin{array}{c} x_0 \\ y_0 \end{array} \right) \; . \]

Note that there is nothing special about writing the points in terms of column arrays, rather than the more usual $(x_i, y_i)$ notation. The reason for doing this is a matter of notational convenience that should become clear below.

Intuitively, we understand that if the vector $\vec v$ stretches from ${\mathcal Q}$ to ${\mathcal P}$, its components are given by

\[  [\vec v]_x \equiv v_x = (x_1 - x_0) \]

and

\[  [\vec v]_y \equiv v_y = (y_1 - y_0) \; , \]

where the notation $[\vec v]_i$ should be read as ‘get the $i$th component of the vector $\vec v$’.

At this point, there is a strong temptation to write the vector $\vec v$ in the same fashion

\[ \vec v = \left( \begin{array}{c} x_1 - x_0 \\ y_1 - y_0 \end{array} \right)  \]

as we did the points ${\mathcal P}$ and ${\mathcal Q}$.

This approach certainly provides some benefits. That the notational forms for points and vectors look the same fits the visual picture of $\vec v$ connecting ${\mathcal Q}$ to ${\mathcal P}$. But the cons of such an approach outweigh the benefits. An individual point on the line is a zero-dimensional object whose address in space is given by the ordered pair. The vector is a one-dimensional object, a portion of the line that behaves like a directed line segment.

In addition, the vector is completely insensitive to a change of origin. What are we to make of two new points ${\mathcal R}$ and ${\mathcal S}$ specified by the ordered pairs

\[ {\mathcal R} = \left(\begin{array}{c} x_2 \\ y_2 \end{array} \right) = \left(\begin{array}{c} x_0 + a \\ y_0 +b \end{array} \right) \]

and

\[  {\mathcal S} = \left(\begin{array}{c} x_3 \\ y_3 \end{array} \right) = \left(\begin{array}{c} x_1 + a \\ y_1 +b \end{array} \right) \; ? \]

Depending on the choices for the values $a$ and $b$, ${\mathcal R}$ and ${\mathcal S}$ may not even fall on $\ell$, and yet they have the same vector $\vec v$ connecting them as do ${\mathcal Q}$ and ${\mathcal P}$.
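A small numerical check makes the point concrete. The sketch below is only an illustration: the helper names `displacement` and `shift` and the particular coordinates are assumptions chosen for the example, not anything fixed by the discussion above. It deliberately stores both points and vectors as bare pairs of numbers, which is exactly the ambiguity under discussion.

```python
from typing import Tuple

# A bare ordered pair; used here (ambiguously) for both points and vectors.
Pair = Tuple[float, float]


def displacement(start: Pair, end: Pair) -> Pair:
    """Components of the vector that stretches from `start` to `end`."""
    return (end[0] - start[0], end[1] - start[1])


def shift(point: Pair, a: float, b: float) -> Pair:
    """The same point after adding (a, b) to its coordinates."""
    return (point[0] + a, point[1] + b)


# Hypothetical coordinates for Q and P.
Q = (1.0, 2.0)
P = (4.0, 6.0)

# R and S are Q and P shifted by the same (a, b).
a, b = 10.0, -3.0
R = shift(Q, a, b)
S = shift(P, a, b)

v_QP = displacement(Q, P)  # (3.0, 4.0)
v_RS = displacement(R, S)  # (3.0, 4.0) as well

print(v_QP == v_RS)  # True: the vector does not see the shift
```

The coordinates of the points change with $a$ and $b$, while the components of the connecting vector do not.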

Mathematicians like to draw the distinction between points and vectors, but they are often clumsy about it. Take, for example, A Course in Mathematics for Students of Physics, Vol. 1, by Bamberg and Sternberg. These authors identify the vector $\vec v$ as an equivalence class and they use the cluttered notation

\[ {\mathcal Q} \; \text{“+”} \; \vec v = {\mathcal P} \]

to define it in terms of the more primitive points. They also use different delimiters around the column arrays which specify the components: parentheses for one and square brackets for the other. Although it isn’t important which is used for which, note, for completeness, that the notation used in this column is the opposite of theirs.

In this notation, the distinction between vectors and points is front and center, but at the cost of complication in the visual presentation. A parametric line would be specified as the set

\[  \ell(t) = \left\{ \left. \left(\begin{array}{c} x_0 \\ y_0 \end{array} \right) + t \left[ \begin{array}{c} v_x \\ v_y \end{array} \right] \right| t \in \mathbb{R} \right\} \; , \]

of all points related by the real number $t$, where the components of $\vec v$ are as specified above.

A cleaner way of thinking about these distinctions is to regard the relations as computing instructions rather than as mathematical definitions. This allows a cleaner form of the notation and the defining equation

\[ \vec v = {\mathcal P} - {\mathcal Q} \]

to be interpreted as ‘the vector $\vec v$ contains the instructions on how to move from the point ${\mathcal Q}$ to the point ${\mathcal P}$’. The equation of the parametric line, now cast in the abstract form without the column arrays, would be the set

\[ \ell(t) = \left\{ \left. {\mathcal Q} + t \vec v \; \right| \; t \in \mathbb{R} \right\} \; , \]

of all points formed by moving a variable amount $t$ from ${\mathcal Q}$ according to the instructions held in $\vec v$.
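One way this reading might carry over into code is sketched below. Everything here is an assumption made for illustration: the class names `Point` and `Vector`, the field names, and the helper `line_point` are hypothetical, and other designs are certainly possible. The idea is that subtracting two points yields a vector, adding a vector to a point yields a point, and adding two points is simply not defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    """Instructions for a move: how far to go in x and in y."""
    dx: float
    dy: float

    def __rmul__(self, t: float) -> "Vector":
        # t * v: follow the same instructions, scaled by t.
        return Vector(t * self.dx, t * self.dy)


@dataclass(frozen=True)
class Point:
    """An address in the plane."""
    x: float
    y: float

    def __sub__(self, other: "Point") -> Vector:
        # P - Q: the instructions for moving from Q to P.
        if not isinstance(other, Point):
            return NotImplemented
        return Vector(self.x - other.x, self.y - other.y)

    def __add__(self, v: Vector) -> "Point":
        # Q + v: carry out the instructions starting from Q.
        if not isinstance(v, Vector):
            return NotImplemented
        return Point(self.x + v.dx, self.y + v.dy)


def line_point(q: Point, v: Vector, t: float) -> Point:
    """The point on the parametric line Q + t v for a given t."""
    return q + t * v


# Hypothetical coordinates for Q and P.
Q = Point(1.0, 2.0)
P = Point(4.0, 6.0)
v = P - Q  # Vector(dx=3.0, dy=4.0)

print(line_point(Q, v, 0.0))  # Point(x=1.0, y=2.0), i.e. Q
print(line_point(Q, v, 1.0))  # Point(x=4.0, y=6.0), i.e. P
print(line_point(Q, v, 0.5))  # Point(x=2.5, y=4.0), halfway along
```

With these definitions an expression such as `Q + P` raises a `TypeError`, so the distinction between points and vectors is enforced by the types rather than by different delimiters.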

The translation to objects in a computer program is now much more straightforward and natural than trying to parse what is meant by an equivalence class. To be clear, I am not criticizing the notion of an equivalence class nor its citation by Bamberg and Sternberg. Rather, I am simply saying that viewing vectors in the context of directed line segments is much more natural in this computer-centric age.
