RE: Is a gradient just jacobian with a single row?
Yes, you're on the right track. In vector calculus, the gradient is a key operation which takes a scalar-valued function and creates a vector-valued function (vector field). In essence, the gradient of a function gives you a vector pointing in the direction of the greatest rate of increase of the function, and the magnitude of that vector is the rate of increase in that direction.
On the other hand, the Jacobian matrix is the matrix of all the first-order partial derivatives of a vector-valued function, which makes it a generalization of the gradient. Each row belongs to one output component and each column to one input variable. In simpler terms, the Jacobian tells you how a small change in the input translates, to first order, into a change in the output.
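To make the layout concrete, here is the usual way to write it down. The names n, m, f_i and h are just notation picked for this reply, not anything from your question:

```latex
% f maps R^n to R^m, with component functions f_1, ..., f_m
J_f(x) =
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{pmatrix}
\in \mathbb{R}^{m \times n},
\qquad
f(x + h) \approx f(x) + J_f(x)\, h \quad \text{for small } h
```

Row i collects the partial derivatives of the i-th output, so with m = 1 only one row is left.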
If your function has a single real-valued output, its Jacobian reduces to a single row, and the entries of that row are exactly the partial derivatives that make up the gradient. Strictly speaking, many texts write the gradient as a column vector, in which case it is the transpose of that single row, but the entries are the same. For functions with multiple outputs, the Jacobian is a whole matrix, not just a single row: it records the rate of change of every output component with respect to every input variable. So yes, you can say the gradient is a Jacobian with a single row, for a function from ℝⁿ to ℝ.
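If you want to see this numerically, here is a rough sketch in plain Python that approximates a Jacobian with central differences. The helper `jacobian` and the example functions `scalar_f` and `vector_g` are made up for illustration, and finite differences are only an approximation, not how you would compute derivatives in serious code:

```python
def jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f at x with central differences.

    f maps a list of n floats to a list of m floats.
    Returns m rows of n entries: one row per output, one column per input.
    """
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Nudge only the j-th input up and down by eps.
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus = f(x_plus)
        f_minus = f(x_minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J


def scalar_f(x):
    # Hypothetical example: f(x, y) = x^2 + 3y, returned as a one-element list.
    return [x[0] ** 2 + 3 * x[1]]


def vector_g(x):
    # Hypothetical example: g(x, y) = (x*y, x + y, x^2).
    return [x[0] * x[1], x[0] + x[1], x[0] ** 2]


J_f = jacobian(scalar_f, [1.0, 2.0])  # one row, close to [2.0, 3.0]
J_g = jacobian(vector_g, [1.0, 2.0])  # three rows, one per output of g

print(len(J_f), J_f)
print(len(J_g), J_g)
```

For `scalar_f` the result has one row, and that row matches the analytic gradient (2x, 3) at (1, 2) up to rounding error. For `vector_g` you get three rows, one per output component.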
This connection between the gradient and the Jacobian is a good starting point for understanding how derivatives work in higher dimensions, and how concepts from 3 dimensions generalize to any number of dimensions. If you're interested in the mathematics of machine learning, this matters because machine learning routinely works with functions of a very large number of variables.