An Introduction to Calculus of Variations

What is Calculus of Variations?

In general, developments in mathematics are motivated by the need for them in applications. Calculus of variations is no exception. In fact, it was first developed in 1696 when Johann Bernoulli challenged the greatest mathematical minds of his time to solve the famous ‘brachistochrone problem’. This problem involves a bead sliding down a frictionless wire attached to two points, and asks what shape the wire should take so that the bead takes the least time to travel from one point to the other [1]. This is a minimization problem, but it is important to note that the ‘basic problem’ of calculus of variations involves making a certain quantity stationary, rather than just minimizing it (recall that a function of one variable can have three types of stationary points, only one of which is a minimum). You will certainly have solved problems in which a quantity must be made stationary before. For example, to find the  x for which the function  f(x) is stationary, you simply differentiate  f(x) and set the derivative to zero, i.e. solve  f'(x)=0 . However, what if I asked you to find a  y(x) which makes the integral   I = \int_{x_{1}}^{x_{2}}F(x, y, y')dx  stationary? This is not a problem which can be solved in the same manner as the one before. Do not despair though, this is where calculus of variations comes in!

In fact, making an integral stationary is a surprisingly common problem in physics and mechanics, and many physical laws can be stated in terms of stationary integrals. Two examples are Fermat’s principle, which states that light travelling between two points takes a path that makes the travel time stationary [2], and Hamilton’s Principle, which is a reformulation of Newton’s Laws and says that any particle or system of particles always moves in such a way that   I = \int_{t_{1}}^{t_{2}}L dt  is stationary, where   L = T - V ,  T is the kinetic energy and  V is the potential energy [3].

Calculus of variations can also be used to find geodesics. A geodesic is the shortest path between two points on a surface. For example, the shortest distance between two points in a plane is given by a straight line. This extremely intuitive fact can actually be proven rigorously using calculus of variations (and we shall do so in this article)! Similarly, and less trivially, the geodesic of any surface can be found using calculus of variations.

Deriving the Euler Equation

Here we will derive the Euler Equation, namely, the condition that  F(x, y, y') must satisfy to make stationary the integral  I =  \int_{x_{1}}^{x_{2}}F(x, y, y')dx  .

Derivation [4]

First, let us consider the definition of a functional. The integral   I[y(x)] = \int_{x_{1}}^{x_{2}}F(x, y, y')dx  is known as a functional. The formal definition of a functional is a real-valued function on a vector space  V [5]. This may seem abstract at first glance, but it is easy to understand when you break it down. First of all, ‘real-valued’ simply means that the functional always returns a real number for any input. This is clearly the case for our integral  I . Secondly, a functional is defined on a vector space rather than on the real number line like the functions you may be used to. If our vector space is a space of functions (recall that this is possible due to the abstract definition of a vector space: its elements can be any mathematical objects that can be added together and multiplied by scalars, not just geometric vectors) then the functional takes inputs that are functions! This is the usual case, and certainly the case that we will be dealing with here. In summary, the functionals that we will deal with take functions as inputs and output real numbers.

In our derivation we will be assuming that our functional   I[y(x)] =  \int_{x_{1}}^{x_{2}}F(x, y, y')dx is defined on a space of functions with fixed endpoints, i.e. a space for which  y(x_{1}) = y_{1} and  y(x_{2}) = y_{2} [6]. This is not a very restrictive assumption, because many problems come with fixed endpoints. For example, finding the shortest path between two points on a surface has predefined endpoints.

Now, recall that  y(x) is any function defined on the interval  [x_{1},x_{2}] (which satisfies the assumption above). Therefore, we can rewrite it as the sum of two functions,  y(x) = Y(x) +\epsilon g(x) where  Y(x) is the extremal we seek (i.e. the function which makes the integral stationary),  g(x) is any function for which  g(x_{1}) = g(x_{2}) = 0 (a necessary condition due to our predefined endpoints) and  \epsilon is a scalar variable.

When  y(x) is defined in this manner our integral  I becomes a function of the variable  \epsilon . We wish to find a function  Y(x) for which   \frac{dI}{d\epsilon} = 0  when  \epsilon = 0 (because we wish the integral to be stationary when  y(x) is the extremal, i.e. when  y(x) = Y(x) ).

I.e. we want  \frac{d}{d\epsilon}\int_{x_{1}}^{x_{2}}F(x, y, y')dx  = 0 when  \epsilon = 0 .
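To make the role of  \epsilon concrete, here is a minimal numerical sketch in Python. It assumes an illustrative integrand  F = y'^{2} on the interval  [0, 1] , with  Y(x) = x and  g(x) = x(1 - x) . These choices, and the helper names `functional` and `derivative_at_zero`, are not part of the derivation; they are picked only so that the result can be checked by hand, since in this case  I(\epsilon) = 1 + \epsilon^{2}/3 .

```python
def functional(eps, n=2000):
    """Approximate I(eps), the integral of (y')^2 over [0, 1], by the midpoint rule.

    Assumed for illustration: y(x) = Y(x) + eps * g(x) with Y(x) = x
    and g(x) = x * (1 - x), so that g vanishes at both endpoints.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        dy = 1.0 + eps * (1.0 - 2.0 * x)  # y'(x) = Y'(x) + eps * g'(x)
        total += dy * dy * h
    return total


def derivative_at_zero(step=1e-4):
    """Central-difference estimate of dI/d(eps) at eps = 0."""
    return (functional(step) - functional(-step)) / (2.0 * step)


if __name__ == "__main__":
    for eps in (-0.5, 0.0, 0.5):
        print(f"I({eps:+.1f}) = {functional(eps):.6f}")
    print(f"dI/d(eps) at eps = 0 is approximately {derivative_at_zero():.2e}")
```

Under these assumptions the integral is smallest at  \epsilon = 0 and the estimated derivative there is numerically indistinguishable from zero, which is exactly the stationarity condition stated above.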

Now, differentiating under the integral sign and applying the chain rule (note that  x does not depend on  \epsilon ),  \frac{d}{d\epsilon}\int_{x_{1}}^{x_{2}}F(x, y, y')dx  = \int_{x_{1}}^{x_{2}}\frac{\partial}{\partial\epsilon}F(x, y, y')dx   = \int_{x_{1}}^{x_{2}}\left(\frac{\partial F}{\partial y}\frac{\partial y}{\partial \epsilon} + \frac{\partial F}{\partial y'}\frac{\partial y'}{\partial \epsilon}\right)dx .

Furthermore,   \frac{\partial y}{\partial\epsilon} = g(x) and  \frac{\partial y'}{\partial\epsilon} = g'(x) , and when  \epsilon = 0 ,  y(x) = Y(x) + \epsilon g(x) = Y(x) and  y'(x) = Y'(x) +\epsilon g'(x) = Y'(x) .

Therefore, at  \epsilon = 0 ,  \int_{x_{1}}^{x_{2}}\left(\frac{\partial F}{\partial y}\frac{\partial y}{\partial \epsilon} + \frac{\partial F}{\partial y'}\frac{\partial y'}{\partial \epsilon}\right)dx = \int_{x_{1}}^{x_{2}}\left(\frac{\partial F}{\partial Y}g(x) + \frac{\partial F}{\partial Y'}g'(x)\right)dx .

If we integrate the second term by parts we obtain  \int_{x_{1}}^{x_{2}}\left(\frac{\partial F}{\partial Y}g(x) + \frac{\partial F}{\partial Y'}g'(x)\right)dx = \int_{x_{1}}^{x_{2}}\frac{\partial F}{\partial Y}g(x)dx + \left[ \frac{\partial F}{\partial Y'}g(x) \right]^{x_{2}}_{x_{1}} - \int_{x_{1}}^{x_{2}}\frac{d}{dx}\left(\frac{\partial F}{\partial Y'}\right)g(x)dx .

Now, because  g(x_{1}) = g(x_{2}) = 0 , the boundary term  \left[ \frac{\partial F}{\partial Y'}g(x) \right]^{x_{2}}_{x_{1}} = 0 and we can therefore eliminate it, leaving only  \int_{x_{1}}^{x_{2}}g(x)\left(\frac{\partial F}{\partial Y} - \frac{d}{dx}\frac{\partial F}{\partial Y'}\right)dx = 0 .
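The integration by parts step can be checked numerically. The sketch below is only an illustration under assumed inputs: `p(x)` is a hypothetical stand-in for  \frac{\partial F}{\partial Y'} along the extremal (here  3x^{2} ), and `g(x)` is  \sin(\pi x) on  [0, 1] , which vanishes at both endpoints. Neither function comes from the derivation itself.

```python
import math


def midpoint(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


def p(x):
    """Hypothetical stand-in for dF/dY' evaluated along the extremal."""
    return 3.0 * x ** 2


def dp(x):
    """Derivative of p with respect to x."""
    return 6.0 * x


def g(x):
    """Assumed variation; it vanishes at x = 0 and x = 1."""
    return math.sin(math.pi * x)


def dg(x):
    """Derivative of g with respect to x."""
    return math.pi * math.cos(math.pi * x)


# Left-hand side: integral of p * g'.
lhs = midpoint(lambda x: p(x) * dg(x), 0.0, 1.0)

# Right-hand side: boundary term minus integral of p' * g.
boundary = p(1.0) * g(1.0) - p(0.0) * g(0.0)
rhs = boundary - midpoint(lambda x: dp(x) * g(x), 0.0, 1.0)

if __name__ == "__main__":
    print(f"integral of p * g'          = {lhs:.6f}")
    print(f"boundary term               = {boundary:.2e}")
    print(f"boundary - integral of p'*g = {rhs:.6f}")
```

With these inputs both sides agree to within the accuracy of the quadrature, and the boundary term is zero up to floating-point rounding, which is why it can be dropped.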

We are almost done now. Recall that  g(x) is an entirely arbitrary function, apart from the requirement that it vanishes at the endpoints. Because of this, for the integral to always be zero (for any  g(x) ) our function  \frac{\partial F}{\partial Y} - \frac{d}{dx}\frac{\partial F}{\partial Y'} must equal zero! This is intuitively plausible: for any other function it would be possible to choose a  g(x) for which the integral is not zero, e.g. choose a  g(x) which is positive where  \left(\frac{\partial F}{\partial Y} - \frac{d}{dx}\frac{\partial F}{\partial Y'}\right) is positive and negative where  \left(\frac{\partial F}{\partial Y} - \frac{d}{dx}\frac{\partial F}{\partial Y'}\right) is negative, so that the integrand is never negative. Furthermore, this statement is made rigorous by the fundamental lemma of calculus of variations, which states that if  M is continuous and  \int_{a}^{b}M(x)g(x)dx = 0 for all infinitely differentiable  g(x) then  M(x) = 0 on the open interval  (a,b) [7].
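The sign-matching argument can be illustrated with a small numerical sketch. It assumes a hypothetical nonzero function  M(x) = x - 1/2 on  [0, 1] and the endpoint-vanishing factor  x(1 - x) ; both are chosen for illustration only. A single test function can give a zero integral by cancellation, but the sign-matched choice  g(x) = M(x)x(1 - x) cannot.

```python
def midpoint(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


def M(x):
    """Hypothetical continuous function that is not identically zero."""
    return x - 0.5


def bump(x):
    """Smooth non-negative factor that vanishes at x = 0 and x = 1."""
    return x * (1.0 - x)


# One particular g: positive and negative parts of M cancel by symmetry.
single = midpoint(lambda x: M(x) * bump(x), 0.0, 1.0)

# Sign-matched g = M * bump: the integrand M^2 * bump is never negative.
matched = midpoint(lambda x: M(x) * (M(x) * bump(x)), 0.0, 1.0)

if __name__ == "__main__":
    print(f"integral with g = bump     : {single:.2e}")
    print(f"integral with g = M * bump : {matched:.6f}")
```

So a nonzero  M is always detected by some admissible  g , which is why the integral can only vanish for every  g if  M itself vanishes.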

Therefore,  \frac{\partial F}{\partial Y} - \frac{d}{dx}\frac{\partial F}{\partial Y'} = 0 and we have obtained the Euler Equation!
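As a quick sanity check, the Euler equation can be applied to Hamilton’s Principle mentioned earlier. The worked calculation below is a sketch under stated assumptions: a single particle of mass  m moving in one dimension with position  x(t) and potential  V(x) . These symbols are illustrative and are not taken from the derivation above; the independent variable is now  t and the unknown function is  x(t) .

```latex
% Assumed setting: one particle, one dimension, L = T - V
L(t, x, \dot{x}) = \tfrac{1}{2} m \dot{x}^{2} - V(x)

% Ingredients of the Euler equation
\frac{\partial L}{\partial x} = -\frac{dV}{dx}, \qquad
\frac{\partial L}{\partial \dot{x}} = m \dot{x}

% Euler equation: dL/dx - d/dt (dL/d\dot{x}) = 0
-\frac{dV}{dx} - \frac{d}{dt}\left( m \dot{x} \right) = 0
\quad \Longrightarrow \quad
m \ddot{x} = -\frac{dV}{dx}
```

Under these assumptions the Euler equation reduces to Newton’s second law with force  -\frac{dV}{dx} , consistent with the description of Hamilton’s Principle as a reformulation of Newton’s Laws.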

A Quick Application

To briefly give an example of the Euler equation in use we will derive the geodesic of the Cartesian plane.

Here we wish to minimize the arc length integral  I = \int_{x_{1}}^{x_{2}}\sqrt{1+y'^{2}}dx . Therefore,  F = \sqrt{1+y'^{2}} ,  \frac{\partial F}{\partial y'} = \frac{y'}{\sqrt{1+y'^{2}}} and  \frac{\partial F}{\partial y} = 0 . Plugging these results into the Euler equation we get  \frac{d}{dx}\frac{y'}{\sqrt{1+y'^{2}}}=0 or  \frac{y'}{\sqrt{1+y'^{2}}} =   constant.

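Writing the constant as  c and squaring gives  y'^{2} = \frac{c^{2}}{1-c^{2}} , so this is only true when  y' = constant. Therefore  y is a linear function of  x and the path is a straight line!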
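The result can also be checked numerically. The sketch below assumes, purely for illustration, the endpoints  (0, 0) and  (1, 1) and the family of curves  y(x) = x + \epsilon\sin(\pi x) , which all pass through those endpoints; the function name `arc_length` is likewise hypothetical.

```python
import math


def arc_length(eps, n=4000):
    """Midpoint-rule length of y(x) = x + eps * sin(pi * x) for x in [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        dy = 1.0 + eps * math.pi * math.cos(math.pi * x)  # y'(x)
        total += math.sqrt(1.0 + dy * dy) * h
    return total


if __name__ == "__main__":
    for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
        print(f"eps = {eps:+.1f}  length = {arc_length(eps):.6f}")
    print(f"sqrt(2)          = {math.sqrt(2.0):.6f}")
```

Within this family the length is smallest at  \epsilon = 0 , where the curve is the straight line and the length equals  \sqrt{2} .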



[1], accessed January 13, 2018

[2] Fermat’s Principle, Encyclopædia Britannica, URL, accessed January 13, 2018

[3] Mathematical Methods in the Physical Sciences Third Edition, Mary L. Boas, Chapter 9

[4] An adapted and extended version of the proof given in [3]

[5] Rowland, Todd. “Functional.” From MathWorld--A Wolfram Web Resource, created by Eric W. Weisstein.

[6], accessed January 13, 2018

[7] Weisstein, Eric W. “Fundamental Lemma of Calculus of Variations.” From MathWorld--A Wolfram Web Resource.

