Lost Among Notes

Differential Forms are simpler than you’ve been told

TL;DR

The great success of differential forms is to generalize determinants and cofactor expansions via the wedge product, giving a viable theory of algebraic area, and then to use this theory to relate integrals over $k$-dimensional “volumes” with integrals over their $(k-1)$-dimensional boundary “surfaces”.
Once you understand the wedge product and the importance of multi-linearity, the definition of the exterior derivative is almost inevitable, and the rest is just book-keeping.

This post explains the relationship between the various concepts that are defined in textbooks.

Abstract

I’ve struggled to understand differential forms for many years. I have several books that present differential forms and the generalized Stokes theorem at various levels of sophistication, and none of them made the subject click into place in my head.

I recently had an epiphany, not from a new book but rather from reframing and untangling bits and pieces from those books.

In this post I’ll just explain the roles of the various objects that are defined in books and materials on differential forms, and how they fit together. I won’t go into the heavy stuff, because that is well covered in those materials. The complete theory requires quite a bit of algebraic machinery. This post is meant to help you through the journey.

Red herrings and obscurity

Books on differential forms and Stokes’s Theorem pay a heavy toll for handling some historical baggage:

  • A smooth transition from “div, grad, curl” and the classic integral theorems
  • The dx notation in integrals

I think those are red herrings. Differential forms and generalized Stokes are clearer than the classical path. And the dx notation for integrals is not really something that adds much insight, and to be honest, there is less to it than meets the eye.

The books tend to be either full-on axiomatic and algebraic, or frustratingly informal and vague, defining differential forms as “something you integrate” or as “a formalism”. Yikes, what a silly thing to say.

Forms are generalizations of determinants

Determinants are defined on square matrices; or, viewed from another vantage point, they are functions that take $N$ vectors in an $N$-dimensional vector space and compute the “volume” of the parallelepiped they span.

When learning the theory of determinants, you probably saw them characterized as multilinear, alternating functions, motivated by the need to measure volumes, and to detect linearly dependent sets of vectors.

A form is, like a determinant, a multilinear, alternating function into $\mathbb{R}$. Like a determinant, a form measures $K$-dimensional volumes in an $N$-dimensional vector space, but… when $K$ is smaller than $N$, something unexpected happens. A form does not work equally well for all $K$-dimensional volumes. A form may measure 0 for a given set of $K$ linearly independent vectors, something that could never happen for determinants.

For example, in $\mathbb{R}^3$ we can define a 2-form $\psi_{xy}$ that measures the area of the projection onto the $x$-$y$ plane of the parallelogram spanned by 2 vectors.

We can define $\psi_{xy}$ as the 2×2 determinant of the $x$-$y$ components of the vectors. Or, said another way, $\psi_{xy}$ is the minor that results from discarding the $z$ components of the two input vectors.

$$\psi_{xy}(a,b) = \psi_{xy}\begin{pmatrix} a_x & b_x \\ a_y & b_y \\ a_z & b_z \end{pmatrix} = \begin{vmatrix} a_x & b_x \\ a_y & b_y \end{vmatrix} = a_x b_y - b_x a_y$$

This function is easily seen to be bilinear and alternating. Note that we could have two linearly independent vectors $a, b$ such that $\psi_{xy}(a,b) = 0$.
For example, $(1,1,1)^T$ and $(1,1,2)^T$ are linearly independent vectors such that $\psi_{xy}$ takes value 0 for the pair. This makes sense, as they span no area when projected onto the $x$-$y$ plane.
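
A minimal numerical sketch of this example (the helper name `psi_xy` is my own):

```python
import numpy as np

# psi_xy: the 2x2 minor of the x-y components of two vectors in R^3
def psi_xy(a, b):
    return a[0] * b[1] - b[0] * a[1]

a = np.array([1, 1, 1])
b = np.array([1, 1, 2])

# a and b are linearly independent...
print(np.linalg.matrix_rank(np.column_stack([a, b])))  # -> 2
# ...yet the form measures 0: their projection onto the x-y plane spans no area
print(psi_xy(a, b))                                    # -> 0
```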

Once you realize this, it’s easier to understand all the trouble the books will go through to prove that the space of multilinear alternating $K$-forms on $\mathbb{R}^N$ is a vector space with dimension $\binom{N}{K}$.

In $\mathbb{R}^3$, the space of 1-forms and the space of 2-forms both have dimension 3.
Let’s define a basis of 1-forms:

$$\phi_x(v) = v_x \qquad \phi_y(v) = v_y \qquad \phi_z(v) = v_z$$

And a basis of 2-forms:

$$\psi_{xy}(a,b) = \begin{vmatrix} a_x & b_x \\ a_y & b_y \end{vmatrix} \qquad \psi_{yz}(a,b) = \begin{vmatrix} a_y & b_y \\ a_z & b_z \end{vmatrix} \qquad \psi_{zx}(a,b) = \begin{vmatrix} a_z & b_z \\ a_x & b_x \end{vmatrix}$$

Note that we said that the space of $K$-forms on $\mathbb{R}^N$ is a vector space with dimension $\binom{N}{K}$. As a special case, when $K$ equals $N$, the space of $K$-forms is 1-dimensional. Which we knew, since the determinant is fully characterized as the alternating multilinear form such that $\det(e_1, \dots, e_N) = 1$ (where the $e_i$ are the standard basis vectors of $\mathbb{R}^N$).
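
A quick sketch of the book-keeping behind that dimension count (my own illustration): the basis $K$-forms correspond to the $K$-element subsets of the $N$ coordinates.

```python
from itertools import combinations
from math import comb

# Enumerate basis K-forms of R^N by the coordinate subsets they keep.
# For N = 3, K = 2 these index psi_xy, psi_zx, psi_yz.
N, K = 3, 2
basis = list(combinations(range(N), K))
print(basis)                      # -> [(0, 1), (0, 2), (1, 2)]
print(len(basis) == comb(N, K))   # -> True: dimension is N choose K
```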

The wedge product generalizes determinant cofactors

When computing determinants, one will generally use the Laplace expansion, aka cofactor expansion, to decompose a determinant into a sum of smaller determinants.

Working downward in dimension, we compute a 3×3 determinant as a weighted addition of 2×2 minors.

Or, working upward in dimension: given that we have 2×2 determinants in the $x$-$y$ plane to measure surface area, how could we build 3×3 determinants to compute volumes with the extra $z$ dimension?

We’ve defined the basis 1-form $\phi_z$ and basis 2-form $\psi_{xy}$. We would expect the “volume” in 3-space to be Area × Height, something like $\mathrm{vol}(a,b,c) = \phi_z(c)\, \psi_{xy}(a,b)$.

However, note that the expression above could give a positive volume for a degenerate 2-dimensional parallelepiped.

Let’s choose:

$$a = (1,0,1)^T, \quad b = (0,1,0)^T, \quad c = a = (1,0,1)^T$$

and observe:

$$\phi_z(c) = 1 \qquad \psi_{xy}(a,b) = 1 \qquad \Longrightarrow \qquad \mathrm{vol}(a,b,c) = 1$$

The failure here is that we combined $\phi_z$ with $\psi_{xy}$ but did not get an alternating form.

The wedge product can combine the forms $\psi_{xy}$ and $\phi_z$, to create an alternating 3-form, which is none other than the determinant in 3-space. Let’s look at a 3×3 determinant’s cofactor expansion.

$$\begin{vmatrix} a_x & b_x & c_x \\ a_y & b_y & c_y \\ a_z & b_z & c_z \end{vmatrix} = a_z \begin{vmatrix} b_x & c_x \\ b_y & c_y \end{vmatrix} - b_z \begin{vmatrix} a_x & c_x \\ a_y & c_y \end{vmatrix} + c_z \begin{vmatrix} a_x & b_x \\ a_y & b_y \end{vmatrix}$$

Notice that

$$a_z = \phi_z(a)$$

and

$$\begin{vmatrix} b_x & c_x \\ b_y & c_y \end{vmatrix} = \psi_{xy}(b,c)$$

So,

$$\begin{aligned} \det(a,b,c) &= \phi_z(a)\,\psi_{xy}(b,c) - \phi_z(b)\,\psi_{xy}(a,c) + \phi_z(c)\,\psi_{xy}(a,b) \\ &= \phi_z \wedge \psi_{xy}\;(a,b,c) \end{aligned}$$

With the previous example $a, b, c$ defining a degenerate 2-dimensional parallelepiped, we’d get:

$$\begin{aligned} \det(a,b,c) &= \phi_z(a)\,\psi_{xy}(b,c) - \phi_z(b)\,\psi_{xy}(a,c) + \phi_z(c)\,\psi_{xy}(a,b) \\ &= 1 \times \psi_{xy}(b,c) - 0 \times 0 + 1 \times \psi_{xy}(a,b) \\ &= 1 \times \psi_{xy}(b,a) + 1 \times \psi_{xy}(a,b) \\ &= 0 \end{aligned}$$

since $a = c$ and $\psi_{xy}$ is alternating.
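
Here’s a quick numerical check (helper names are mine) that this cofactor combination really is the 3×3 determinant:

```python
import numpy as np

def phi_z(v):
    return v[2]

def psi_xy(a, b):
    return a[0] * b[1] - b[0] * a[1]

rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 3))

# phi_z ∧ psi_xy, written out as the cofactor expansion above
wedge = (phi_z(a) * psi_xy(b, c)
         - phi_z(b) * psi_xy(a, c)
         + phi_z(c) * psi_xy(a, b))
det = np.linalg.det(np.column_stack([a, b, c]))
print(np.isclose(wedge, det))   # -> True
```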

In determinants, one ends up computing a sum made up of products of permuted vector components.

$$\det(A) = \sum_{\sigma \in S_n} \left( \operatorname{sign}(\sigma) \prod_{i=1}^{n} a_{i,\sigma_i} \right)$$

ref: wikipedia

The wedge product generalizes this: given two alternating forms, it shuffles them in such a way that the result is also an alternating form. And the crazy thing is, by this simple insistence on getting another multilinear alternating function, we end up with a viable and consistent theory of algebraic volume.

The proofs and computations around the wedge product are a bit of a slog, but knowing that they’re just repeating the determinant’s trick of permuting components, it all makes more sense.
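
Here is that shuffle spelled out as a sketch in code, under the usual $\frac{1}{j!\,k!}$ normalization convention (helper names are mine); with it, $\phi_z \wedge \psi_{xy}$ reproduces the 3×3 determinant:

```python
import numpy as np
from itertools import permutations
from math import factorial

def sign(perm):
    # sign of a permutation, by counting inversions
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def wedge(alpha, j, beta, k):
    # (alpha ∧ beta)(v_1, ..., v_{j+k}) = 1/(j! k!) * sum over permutations p
    # of sign(p) * alpha(v_{p(1)}, ..., v_{p(j)}) * beta(v_{p(j+1)}, ...)
    def gamma(*vs):
        total = 0.0
        for p in permutations(range(j + k)):
            total += (sign(p)
                      * alpha(*(vs[i] for i in p[:j]))
                      * beta(*(vs[i] for i in p[j:])))
        return total / (factorial(j) * factorial(k))
    return gamma

phi_z = lambda v: v[2]
psi_xy = lambda a, b: a[0] * b[1] - b[0] * a[1]

rng = np.random.default_rng(1)
a, b, c = rng.standard_normal((3, 3))
print(np.isclose(wedge(phi_z, 1, psi_xy, 2)(a, b, c),
                 np.linalg.det(np.column_stack([a, b, c]))))  # -> True
```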

Differential forms are just form fields

A vector field is an assignment of a vector to each point in a space. A differential form is an assignment of a form to each point in a space. That’s it, there’s nothing more there. Differential forms should be called form fields. And that’s what I’ll call them for the remainder of this post.

We already know a 1-form field: the derivative. In the generalized realm of functions on vector spaces, the derivative of a function $f : V \to \mathbb{R}$ at a point $x_0$ is the linear map that approximates the function near $x_0$.

$$Df|_{x_0}(h) \approx f(x_0 + h) - f(x_0)$$

The $|_{x_0}$ notation expresses that $Df$ has a dependency on $x_0$, while the “linear” argument is $h$. By convention we’ll say that $f$ itself is a 0-form, i.e. it takes no “linear” argument. So, $Df$ takes 1 more linear argument than $f$.
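
A quick numerical illustration (the function is my own choice): the difference $f(x_0 + h) - f(x_0)$ is indeed well approximated by a map that is linear in $h$.

```python
import numpy as np

# f : R^3 -> R, f(v) = <v, v>, whose derivative at x0 is the
# linear map h -> 2 <x0, h>
f = lambda v: np.dot(v, v)
Df = lambda x0, h: 2 * np.dot(x0, h)

x0 = np.array([1.0, 2.0, 3.0])
h = 1e-4 * np.array([0.3, -0.5, 0.2])

print(f(x0 + h) - f(x0))   # the difference of function values
print(Df(x0, h))           # the linear approximation; agrees to O(|h|^2)
```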

In several books there is a long preparation leading to the definition of a differential form. Really, it’s nothing much on top of forms. And the term differential forms makes them sound all too fancy. I don’t see anything intrinsically “differential” about them.

Integrals in the small are approximately forms

In the general setting of functions on vector spaces, the derivative of a function at a point is the linear map that best approximates the function in a small neighborhood of the point.

In the same spirit, we can examine the integral of a function $f$ in a very small parallelepiped around a point $p$, spanned by $k$ vectors. Let’s call the parallelepiped $S_p$, and the vectors that span it $v_1, \dots, v_k$. If the function $f$ is well behaved, we can consider it approximately constant in $S_p$. Then, by analogy to single-variable calculus, we could say the value of the integral on $S_p$ is $f(p)$ times the volume of $S_p$. We know how to compute volumes: forms.

For a small enough parallelepiped $S_p$ around $p$ then, the integral of $f$ would be:

$$\int_{S_p} f \approx f(p)\; \omega(v_1, \dots, v_k)$$

for some $k$-form $\omega$. Note that $f(p)\, \omega$ is also a $k$-form, so in the small, the integral is a $k$-form.

This point especially explains why forms are useful in the theory of integration. While the derivative is the linear function that approximates a given function locally, a form field approximates integrals over $k$-volumes locally.
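
Here’s a minimal numerical sketch of that claim (function and parallelogram are my own choices): over a tiny parallelogram in $\mathbb{R}^2$, a brute-force integral of $f$ is close to $f(p)$ times the area the 2-form measures.

```python
import numpy as np

f = lambda x, y: np.exp(x) * np.cos(y)    # any smooth function
psi_xy = lambda a, b: a[0] * b[1] - b[0] * a[1]

p = np.array([0.5, 0.5])
v1 = np.array([1e-3, 0.0])
v2 = np.array([0.5e-3, 1e-3])             # a small slanted parallelogram

# brute-force integral over S_p = {p + s*v1 + t*v2 : s, t in [0, 1]}
n = 200
s, t = np.meshgrid((np.arange(n) + 0.5) / n, (np.arange(n) + 0.5) / n)
xs = p[0] + s * v1[0] + t * v2[0]
ys = p[1] + s * v1[1] + t * v2[1]
area = psi_xy(v1, v2)                     # Jacobian of the parametrization
integral = np.mean(f(xs, ys)) * area

print(integral, f(*p) * psi_xy(v1, v2))   # nearly identical
```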

Forms dictate how domains of integration count for the integral

We’ve seen the basis forms $\phi_x$ and $\psi_{xy} = \phi_x \wedge \phi_y$. We’d like to integrate them.

In $n$-space, even the simplest integral needs to choose a direction. Looking at the integral as the limit of a sum, let’s define a line $L$ as the domain of integration.

Let’s integrate $f\phi_x$ over $L$ by decomposing $L$ into a series of segments $l_m$. The function $f$ would take values at points of the form $x_m = p + l_1 + \dots + l_m$.

$$\int_L f\phi_x \approx \sum_{m=1}^{M} f(x_m)\,\phi_x(l_m)$$

Imagine $L$ is a vertical line, that is, $l_m = c_m e_y$.

Then the $\phi_x(l_m) = c_m \phi_x(e_y) = 0$ all vanish, and the integral is 0.

Now imagine $L$ is horizontal and $l_m = c_m e_x$. Then $\phi_x(l_m) = c_m \phi_x(e_x) = c_m$, and

$$\sum_{m=1}^{M} f(x_m)\,\phi_x(l_m) = \sum_{m=1}^{M} f(x_m)\, c_m$$

And this looks very much like a plain old integral from single-variable calculus.

The form $\phi_x$ dictates that the vertical component of the domain of integration is ignored, and that if our domain of integration is “parallel to” $\phi_x$, then the integral is just a plain old integral.
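
Here’s that sum as a sketch in code (names are mine): the same discretized sum gives 0 on a vertical line and an ordinary single-variable integral on a horizontal one.

```python
import numpy as np

f = lambda v: v[0] ** 2 + v[1]      # some function on R^2
phi_x = lambda l: l[0]              # the basis 1-form

def integrate(p, direction, length, M=10_000):
    # sum_m f(x_m) phi_x(l_m), with L split into M segments l_m
    lm = (length / M) * direction
    total = 0.0
    x = np.array(p, dtype=float)
    for _ in range(M):
        x = x + lm
        total += f(x) * phi_x(lm)
    return total

# vertical line: phi_x kills every segment, so the integral is 0
print(integrate([0.0, 0.0], np.array([0.0, 1.0]), 1.0))   # -> 0.0
# horizontal line along y = 0: the ordinary integral of x^2 over [0, 1]
print(integrate([0.0, 0.0], np.array([1.0, 0.0]), 1.0))   # -> ~1/3
```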

In the rest of this post we only need to consider domains of integration that are perpendicular or parallel to the form being integrated.

The exterior derivative is … what you’d expect

Having established that in the small, integrals are approximately forms, and that forms can be combined into higher-dimensional forms, the exterior derivative is almost low hanging fruit.

Let’s remember that, for a function $f : V \to \mathbb{R}$,

$$Df|_{x_0}(h) \approx f(x_0 + h) - f(x_0)$$

We’re going to invent a new operation, the exterior differential $d$. Its first “rule” is for functions $f : V \to \mathbb{R}$, i.e. for 0-forms. The exterior differential of a 0-form is the same as the derivative:

$$df|_{x_0}(h) = Df|_{x_0}(h) \approx f(x_0 + h) - f(x_0)$$

Let’s imagine we have a $k$-form field. The general $k$-form field is a linear combination of basis $k$-forms, and as we saw already, there are $\binom{n}{k}$ such basis $k$-forms in a space of dimension $n$.

$$\omega(v_1, \dots, v_k) = \sum_{i=1}^{\binom{n}{k}} f_i(x_1, \dots, x_n)\; \phi_{\sigma_1, \dots, \sigma_k}(v_1, \dots, v_k)$$

Note that the functions $f_i$ are defined on the ambient vector space with dimension $n$, NOT $k$.

We lose no generality by looking at a single summand:

$$f_i(x_1, \dots, x_n)\; \phi_{\sigma_1, \dots, \sigma_k}(v_1, \dots, v_k)$$

For convenience, let’s just call this $f(x)\, \phi(v_1, \dots, v_k)$.

Now, a differential or exterior derivative ought to be found by taking differences at points that are close, just as with the derivative of a function:

$$f(x+h)\,\phi(v_1, \dots, v_k) - f(x)\,\phi(v_1, \dots, v_k)$$

$$= [f(x+h) - f(x)]\;\phi(v_1, \dots, v_k)$$

$$\approx df|_x(h)\; \phi(v_1, \dots, v_k)$$

The combination has one more input vector, $h$, on top of $v_1, \dots, v_k$. So, we’re combining the forms $df|_x$ and $\phi$. Let’s aim to get an alternating form as a result. Given that $\phi$ is a $k$-form, the result will be an alternating $(k+1)$-form. We can get this by using the wedge product. We define:

$$d(f\phi)|_x = df|_x \wedge \phi$$

It’s important; that’s why it gets a frame.
Let’s go back to this part:

$$f(x+h)\,\phi(v_1, \dots, v_k) - f(x)\,\phi(v_1, \dots, v_k)$$

That’s a mini-integral at point $x+h$, and a mini-integral at point $x$ with negative sign. Both are $k$-forms, so they measure $k$-volumes. $d(f\phi)|_x$ is a $(k+1)$-form, so it’s a mini-integral measuring $(k+1)$-volumes.

Using the $|_x$ notation to clarify the arguments we’re interested in:

$$d(f\phi)|_x(h, v_1, \dots, v_k) \approx f|_{x+h}\,\phi(v_1, \dots, v_k) - f|_x\,\phi(v_1, \dots, v_k)$$
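
Here’s a numerical sketch of that approximation in $\mathbb{R}^2$ (the function and step vectors are my own choices), taking $\phi = \phi_x$ and the step $h$ perpendicular to it, so $\phi_x(h) = 0$, in line with the perpendicular-or-parallel restriction from earlier:

```python
import numpy as np

f = lambda v: np.sin(v[0]) * v[1] ** 2     # a smooth f on R^2
phi_x = lambda u: u[0]

def df(x, u, eps=1e-4):
    # directional derivative of f at x along u, via central differences
    return (f(x + eps * u) - f(x - eps * u)) / (2 * eps)

def d_f_phi_x(x, h, v):
    # d(f phi_x)|_x = df|_x ∧ phi_x, evaluated as a 2-form on (h, v)
    return df(x, h) * phi_x(v) - df(x, v) * phi_x(h)

x = np.array([0.7, 1.3])
h = np.array([0.0, 1e-3])   # step perpendicular to phi_x, so phi_x(h) = 0
v = np.array([1e-3, 0.0])   # edge parallel to phi_x

lhs = d_f_phi_x(x, h, v)
rhs = f(x + h) * phi_x(v) - f(x) * phi_x(v)   # difference of mini-integrals
print(lhs, rhs)                               # agree up to higher-order terms
```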

Bringing it all together … stopping before Stokes

The last equation smacks of the fundamental theorem of calculus. Let’s corroborate this.

We’re going to pick a very simple domain of integration: an $(n-1)$-dimensional base $B = a_1 e_1 \times \dots \times a_{n-1} e_{n-1}$, with height given by $H e_n$, so the domain of integration is $S_p = B \times H e_n$ at point $p$. Let’s divide $H$ into $m$ small increments $h = \frac{H}{m}\, e_n$.

$$\sum_{l=1}^{m} d(f\phi)|_{x+(l-1)h}(h, v_1, \dots, v_k)$$

using the formula from the last section…

$$\sum_{l=1}^{m} \left[ f|_{x+lh}\,\phi(v_1, \dots, v_k) - f|_{x+(l-1)h}\,\phi(v_1, \dots, v_k) \right]$$

$$= f|_{x+He_n}\,\phi(v_1, \dots, v_k) - f|_x\,\phi(v_1, \dots, v_k)$$

As happens in single-variable calculus, the sums of differences have a telescoping effect, and collapse to a single subtraction. One would suspect:

$$\int_{S_p} d(f\phi) = \int_{(p+He_n) \times B} f\phi - \int_{p \times B} f\phi$$

Let’s prove it.

$$d(f\,\phi_1 \wedge \dots \wedge \phi_{n-1}) = \left( \sum_{m=1}^{n} \partial_{x_m} f\; \phi_m \right) \wedge \phi_1 \wedge \dots \wedge \phi_{n-1}$$

$$= \partial_{x_n} f\; \phi_n \wedge \phi_1 \wedge \dots \wedge \phi_{n-1} = (-1)^{n-1}\, \partial_{x_n} f\; \phi_1 \wedge \dots \wedge \phi_n$$

Note that reordering the $\phi_i$ as we have done above incurs a possible change of sign: each transposition of adjacent 1-forms adds a sign reversal, and moving $\phi_n$ across the $n-1$ forms in front of it yields the $(-1)^{n-1}$ factor in the equation above.
Much as I commented that getting div, curl, grad is a red herring, it is satisfying to compute

$$d(f\phi_x + g\phi_y) = \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) \phi_x \wedge \phi_y$$

just so easily. Yes, I know, that was implicitly done in dimension 2. If you did it in dimension 3, $z$ and $\phi_z$ would enter the picture, and you’d get the curl.
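
If you want to check that computation symbolically, here’s a minimal sketch with sympy (the example coefficient functions $f$ and $g$ are my own):

```python
import sympy as sp

# d(f phi_x + g phi_y) = (dg/dx - df/dy) phi_x ∧ phi_y
x, y = sp.symbols('x y')
f = x * y**2
g = sp.sin(x) * y

coeff = sp.diff(g, x) - sp.diff(f, y)
print(coeff)   # -> y*cos(x) - 2*x*y, the coefficient of phi_x ∧ phi_y
```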

OK, back to our proof:

$$\begin{aligned}
\int_{S_p} d(f\,\phi_1 \wedge \dots \wedge \phi_{n-1}) &= (-1)^{n-1} \int_{S_p} \partial_{x_n} f\; \phi_1 \wedge \dots \wedge \phi_n \\
&= (-1)^{n-1} \int_B \left( \int_p^{p+He_n} \partial_{x_n} f\; \phi_n \right) \phi_1 \wedge \dots \wedge \phi_{n-1} \\
&= (-1)^{n-1} \int_B \left( f|_{p+He_n} - f|_p \right) \phi_1 \wedge \dots \wedge \phi_{n-1} \\
&= (-1)^{n-1} \left( \int_B f|_{p+He_n}\; \phi_1 \wedge \dots \wedge \phi_{n-1} - \int_B f|_p\; \phi_1 \wedge \dots \wedge \phi_{n-1} \right) \\
&= (-1)^{n-1} \left( \int_{(p+He_n) \times B} f\; \phi_1 \wedge \dots \wedge \phi_{n-1} - \int_{p \times B} f\; \phi_1 \wedge \dots \wedge \phi_{n-1} \right)
\end{aligned}$$

We thus get our proof of a sort of “Fundamental Theorem of Calculus”, but we’re not quite there yet.

These two integrals are across the $(n-1)$-dimensional base $B$, and their opposite signs show their opposite orientations. We can think of them as the integral of $f\,\phi_1 \wedge \dots \wedge \phi_{n-1}$ across the lid and base of the domain $S_p$. And, noting that $\phi_1 \wedge \dots \wedge \phi_{n-1}$ will be 0 for all the other “walls” of $S_p$, we can say that the two summands compute the total integral of

$$f\,\phi_1 \wedge \dots \wedge \phi_{n-1}$$

across the oriented boundary of $S_p$. The boundary will incorporate the book-keeping for that $(-1)^{n-1}$ we encountered above.

$$\int_{S_p} d(f\phi) = \int_{(p+He_n) \times B} f\phi - \int_{p \times B} f\phi = \int_{\partial S_p} f\phi$$
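
As a sanity check, here’s a numerical sketch of this result in $\mathbb{R}^2$ (domain and function are my own choices): we integrate $d(f\phi_x) = -\frac{\partial f}{\partial y}\,\phi_x \wedge \phi_y$ over the unit square and compare against the integral of $f\phi_x$ over the oriented boundary, where $\phi_x$ kills the two vertical walls.

```python
import numpy as np

# Check: integral over S of d(f phi_x) == integral over dS of f phi_x,
# with S = [0,1] x [0,1]. Only the bottom and top edges contribute on dS.
def f(x, y):
    return np.sin(x) * (1 + y**2)   # any smooth function works

n = 500
t = (np.arange(n) + 0.5) / n        # midpoints of [0, 1]
dx = dy = 1.0 / n

# left-hand side: integrate -(df/dy) over the square (central differences)
xs, ys = np.meshgrid(t, t)
eps = 1e-6
dfdy = (f(xs, ys + eps) - f(xs, ys - eps)) / (2 * eps)
lhs = np.sum(-dfdy) * dx * dy

# right-hand side: bottom edge (y=0) traversed +x, top edge (y=1) traversed -x
rhs = np.sum(f(t, 0.0)) * dx - np.sum(f(t, 1.0)) * dx

print(lhs, rhs)                     # both ≈ cos(1) - 1 ≈ -0.4597
```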

We’ve only done one term $f\,\phi_{\sigma_1} \wedge \dots \wedge \phi_{\sigma_k}$ while there are $\binom{N}{K}$ such terms; what remains is just keeping your notation straight, and is well handled by books and materials on forms and Stokes.

It’s interesting to realize that the fundamental part of this result, and the fundamental part of Stokes’s Theorem as done with forms, is the definition of the exterior derivative.

Once we got here:

$$d(f\,\phi_1 \wedge \dots \wedge \phi_{n-1}) = \partial_{x_n} f\; \phi_n \wedge \phi_1 \wedge \dots \wedge \phi_{n-1}$$

everything else was simple follow-through.

Michael Spivak writes this in Calculus on Manifolds:

Stokes’ theorem shares three important attributes with many fully evolved major theorems:

  1. It is trivial.
  2. It is trivial because the terms appearing in it have been properly defined.
  3. It has significant consequences.

There’s more… elsewhere

There are more pieces to the narrative, which you can find in the books. Majorly missing from this post:

  • integrating the general form with its $\binom{N}{K}$ terms
  • generalized “cubes” and their boundaries in N-space via chains
  • integrating over non-rectangular domains via pullbacks
  • a proper discussion of orientation
  • defining exterior differentiation in a coordinate-free way

I haven’t found a single book that made everything click, and several popular ones left me frustrated, so I’ll ignore them. However, three books I recommend:

  • Harold Edwards, Advanced Calculus: A Differential Forms Approach
  • Serge Lang, Undergraduate Analysis
  • Bamberg & Sternberg, A Course in Mathematics for Students of Physics

Edwards and Bamberg & Sternberg are both introductory and try to motivate things. Lang gives a quick intro without much motivation and gets to Stokes with ease and no fuss.

Honorable mention to Michael Spivak’s Calculus on Manifolds and A Comprehensive Introduction to Differential Geometry. They are a bit uneven, in that Spivak just goes full-steam on multilinear algebra without much in the way of motivation, but then, they do get to the point fast, and Spivak knows how to tell a story with math.

Godspeed.