There are two commonly seen definitions of the dot product, which at first glance don’t seem obviously related:
- The algebraic definition:
\(\mathbf{x} \cdot \mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n\)
- The geometric definition:
\(\mathbf{x} \cdot \mathbf{y} = |\mathbf{x}| |\mathbf{y}| \cos \theta\)
where \(\theta\) is the angle between the vectors \(\mathbf{x}\) and \(\mathbf{y}\), and \(|\cdot|\) denotes the Euclidean norm.
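To make the two definitions concrete, here is a minimal sketch that evaluates both for one pair of 2D vectors. The function names and the sample vectors are illustrative assumptions, not from the original text; the angle is computed from each vector's polar angle via `math.atan2`, so the geometric version does not secretly rely on the algebraic one.

```python
import math

def dot_algebraic(x, y):
    """Sum of the products of corresponding entries."""
    return sum(a * b for a, b in zip(x, y))

def dot_geometric(x, y):
    """|x| |y| cos(theta) for 2D vectors, with theta taken as the
    difference of the two polar angles."""
    theta = math.atan2(y[1], y[0]) - math.atan2(x[1], x[0])
    return math.hypot(*x) * math.hypot(*y) * math.cos(theta)

# Sample vectors chosen only for illustration.
x = (3.0, 4.0)
y = (2.0, 1.0)

algebraic = dot_algebraic(x, y)   # 3*2 + 4*1
geometric = dot_geometric(x, y)   # agrees up to floating-point rounding

print(algebraic, geometric)
```

The two values agree up to rounding, which is the coincidence the rest of the post sets out to explain.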
A bridge between these two perspectives comes from thinking about linear maps — particularly, the linear map that projects any vector onto a line defined by a unit vector \(\mathbf{u}\). This framing is nicely illustrated in 3Blue1Brown’s video, from which I’m borrowing key ideas.
Step 1: Projection onto a unit vector
Let \(\mathbf{u} = \begin{bmatrix} u_x \\ u_y \end{bmatrix}\) be a unit vector in \(\mathbb{R}^2\). Consider the linear map \(P_{\mathbf{u}}: \mathbb{R}^2 \to \mathbb{R}\) that sends any vector \(\mathbf{v}\) to its projection onto the line spanned by \(\mathbf{u}\), measured as a scalar (i.e., how far along \(\mathbf{u}\) you go to reach the projection).
Since this map is linear, it must be representable by a \(1 \times 2\) matrix. To find this matrix, observe how it acts on the standard basis vectors:
- \(\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\)
- \(\mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\)
Because \(\mathbf{e}_1\), \(\mathbf{e}_2\), and \(\mathbf{u}\) are all unit vectors, projecting a basis vector onto \(\mathbf{u}\) gives the same scalar as projecting \(\mathbf{u}\) onto that basis vector, so we have:
$$\begin{aligned}
P_{\mathbf{u}}(\mathbf{e}_1) &= P_{\mathbf{e}_1}(\mathbf{u}) = u_x \\
P_{\mathbf{u}}(\mathbf{e}_2) &= P_{\mathbf{e}_2}(\mathbf{u}) = u_y
\end{aligned}$$
So, the matrix representing this linear map with respect to the standard basis (recall that any matrix representation of a linear map depends on the choice of basis) is:
$$\operatorname{Matrix}(P_{\mathbf{u}}) = \begin{bmatrix} u_x & u_y \end{bmatrix}$$
Then for any vector \(\mathbf{v} = \begin{bmatrix} v_x \\ v_y \end{bmatrix}\), the projection scalar is:
$$P_{\mathbf{u}}(\mathbf{v}) = \begin{bmatrix} u_x & u_y \end{bmatrix} \begin{bmatrix} v_x \\ v_y \end{bmatrix} = u_x v_x + u_y v_y$$
which is exactly the algebraic definition: the sum of the products of the corresponding entries.
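As a quick numerical check of Step 1, the sketch below applies the \(1 \times 2\) matrix \(\begin{bmatrix} u_x & u_y \end{bmatrix}\) to the basis vectors and to an arbitrary vector, then compares the result with a projection length computed purely from angles. The helper name `projection_scalar`, the 30-degree direction, and the sample vector are assumptions made for illustration.

```python
import math

def projection_scalar(u, v):
    """Apply the 1x2 matrix [u_x, u_y] to the column vector v."""
    u_x, u_y = u
    v_x, v_y = v
    return u_x * v_x + u_y * v_y

# A unit vector at 30 degrees from the x-axis (illustrative choice).
angle = math.radians(30)
u = (math.cos(angle), math.sin(angle))

# The matrix entries are the images of the standard basis vectors.
e1 = (1.0, 0.0)
e2 = (0.0, 1.0)
on_e1 = projection_scalar(u, e1)  # equals u_x
on_e2 = projection_scalar(u, e2)  # equals u_y

# Independent geometric check: the signed length of v along u is
# |v| * cos(angle between v and u), computed here from polar angles.
v = (2.0, 3.0)
phi = math.atan2(v[1], v[0]) - angle
geometric = math.hypot(*v) * math.cos(phi)
matrix_result = projection_scalar(u, v)

print(on_e1, on_e2, matrix_result, geometric)
```

The matrix result and the angle-based projection length match up to rounding, which is what the derivation predicts for a unit vector \(\mathbf{u}\).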
Step 2: Generalising to non-unit vectors
Now suppose we have a non-unit vector \(\mathbf{u} \in \mathbb{R}^2\). We can still define a map \(Q_{\mathbf{u}}\) that:
- First projects onto the line in the direction of \(\mathbf{u}\), using the unit vector \(\hat{\mathbf{u}} = \frac{\mathbf{u}}{|\mathbf{u}|}\)
- Then scales the result by \(|\mathbf{u}|\)
So we get:
$$
Q_{\mathbf{u}}(\mathbf{v})
= |\mathbf{u}| \, P_{\hat{\mathbf{u}}}(\mathbf{v})
= |\mathbf{u}| \, (\hat{\mathbf{u}} \cdot \mathbf{v})
= (|\mathbf{u}| \, \hat{\mathbf{u}}) \cdot \mathbf{v}
= \mathbf{u} \cdot \mathbf{v}
$$
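Thus, even when \(\mathbf{u}\) is not a unit vector, the dot product \(\mathbf{u} \cdot \mathbf{v}\) can be interpreted as a scaled projection: you first project \(\mathbf{v}\) onto the direction of \(\mathbf{u}\), and then scale by the length of \(\mathbf{u}\). Since the scalar projection of \(\mathbf{v}\) onto the direction of \(\mathbf{u}\) is \(|\mathbf{v}| \cos \theta\), where \(\theta\) is the angle between \(\mathbf{u}\) and \(\mathbf{v}\), this scaled projection equals \(|\mathbf{u}| |\mathbf{v}| \cos \theta\), which is the geometric definition.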
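The two-step map \(Q_{\mathbf{u}}\) can be sketched in a few lines. The function name `scaled_projection` and the sample vectors are hypothetical, chosen only to illustrate the normalise, project, rescale sequence described in Step 2.

```python
import math

def scaled_projection(u, v):
    """Project v onto the direction of u, then scale by |u| (2D only)."""
    norm_u = math.hypot(*u)
    u_hat = (u[0] / norm_u, u[1] / norm_u)            # unit vector along u
    projection = u_hat[0] * v[0] + u_hat[1] * v[1]    # P_{u_hat}(v)
    return norm_u * projection                        # scale by |u|

# Illustrative vectors: u has length 5, so it is clearly not a unit vector.
u = (3.0, 4.0)
v = (2.0, 1.0)

q = scaled_projection(u, v)
direct = u[0] * v[0] + u[1] * v[1]  # algebraic dot product, 3*2 + 4*1

print(q, direct)
```

Both values come out equal up to rounding, since dividing by \(|\mathbf{u}|\) and then multiplying by it again cancels.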