Matrix calculus cheatsheet

I often get lost in the indices when dealing with differentiation with respect to multi-dimensional quantities, so here I'm writing up some intuitions and identities that I find useful. The cheatsheet from the Imperial College Department of Computing (included at the bottom of this post) is also quite handy.

Gradient points in direction of steepest ascent

Consider a function of two variables f(x,y), which you can imagine plotting as a 3D graph: a surface with the values of f on the z axis.

The gradient of f is defined as the column vector $$\nabla f = \begin{pmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{pmatrix}$$
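As a quick illustrative example (my own choice of function, any simple one works), take f(x,y) = x^2 y. Then

$$\nabla f = \begin{pmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{pmatrix} = \begin{pmatrix} 2xy \\ x^2 \end{pmatrix}$$

so at the point (1, 3) the gradient is (6, 1).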

At a given point (x,y), if we ask how much f would change if we moved slightly in a direction $$\mathbf{v} = (v_1, v_2)$$, the answer (the directional derivative) is $$v_1 \frac{\partial f}{\partial x} + v_2 \frac{\partial f}{\partial y}$$, which is just the dot product $$\mathbf{v} \cdot \nabla f$$.

Fixing the size of $$\mathbf{v}$$, this is maximised when $$\mathbf{v}$$ is parallel to $$\nabla f$$. Geometrically, we can easily see that the change is maximised when we point orthogonal to the current contour (heading directly towards the next contour). So $$\nabla f$$ points in the direction of steepest ascent.
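Here is a minimal numerical sketch of that claim, assuming NumPy is available (the function, point and step size are arbitrary illustrative choices of mine): sample many unit directions and check that the increase in f is largest in the direction of the gradient.

```python
import numpy as np

# Example function and its gradient (arbitrary illustrative choices)
f = lambda x, y: x**2 * y + np.sin(y)
grad_f = lambda x, y: np.array([2 * x * y, x**2 + np.cos(y)])

x0, y0 = 1.0, 2.0          # point at which we probe f
g = grad_f(x0, y0)
eps = 1e-4                 # small step size

# Sample unit directions around the circle and measure the change in f
angles = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
changes = np.array([f(x0 + eps * v[0], y0 + eps * v[1]) - f(x0, y0) for v in dirs])

best_dir = dirs[np.argmax(changes)]
print(best_dir)                    # ~ the unit vector along the gradient
print(g / np.linalg.norm(g))       # should match the line above closely
```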

Matrix calculus identities

Below are some useful identities. I have found that Einstein summation notation (summing over repeated indices) is a good way to derive them quickly.

For example, to prove that $$\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^\top A \mathbf{x} \right) = A\mathbf{x} + A^\top \mathbf{x}$$

Let $$y = \mathbf{x}^\top A \mathbf{x} = x_i A_{ij} x_j$$

Then
\begin{align}
\frac{\partial y}{\partial x_k} &= \frac{\partial}{\partial x_k} \left( x_i A_{ij} x_j \right) \\
&= \left( \frac{\partial x_i}{\partial x_k} \right) A_{ij} x_j + x_i A_{ij} \left( \frac{\partial x_j}{\partial x_k} \right) \\
&= \delta_{ik} A_{ij} x_j + x_i A_{ij} \delta_{jk} \\
&= A_{kj} x_j + x_i A_{ik} \\
&= (A \mathbf{x})_k + (A^\top \mathbf{x})_k
\end{align}

i.e. we have $$\frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^\top A \mathbf{x} \right) = A \mathbf{x} + A^\top \mathbf{x}$$
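A quick numerical sanity check of this identity, again assuming NumPy (the matrix A and the point x below are random illustrative choices): compare a finite-difference gradient of x^T A x against A x + A^T x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))    # a general (non-symmetric) matrix
x = rng.normal(size=n)

y = lambda x: x @ A @ x        # the scalar function y = x^T A x

# Central finite-difference approximation of dy/dx
eps = 1e-6
grad_fd = np.array([
    (y(x + eps * e) - y(x - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

grad_analytic = A @ x + A.T @ x
print(np.allclose(grad_fd, grad_analytic, atol=1e-5))   # expected: True
```

Note that when A is symmetric the right-hand side collapses to the familiar 2Ax.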

Cheatsheet from Imperial College
