Rate of change is very useful (e.g. velocity is distance with respect to time). Differentiation is the method of finding the rate of change of a function. And, given multiple independent variables, partial differentiation allows us to explore the rate of change with respect to each independent variable.
Integration can be seen as the inverse of differentiation, hence the alias anti-derivative.
Consider the example v(t) = Δs/Δt just to see that by Δt we mean an interval of time. It's technically the average velocity over that interval of time. A function is a relation between two sets that associates every element of one set to exactly one element of the other set.
We kind of brush over the concept of a limit, which is quite critical to the fundamentals of Calculus, but typically not invoked directly (so understandable). A limit is the value that the result of a function, the dependent variable, approaches as the independent variable(s) approach some value of interest.
The slope of a curve is typically denoted as:
m = Δy/Δx
definition 1.1
However, for actually curved curves (non-linear relationships, typically), the slope is not exact. We can approximate the slope by decreasing the size of Δx until it is indistinguishable from 0. This gives us the change in y per infinitesimal change in x. The line that would intercept the curve at both x points now only touches the curve in one spot. Geometrically, we refer to it as the tangent line: the slope of the line tangent to the function f(x) at the point x.
Additionally, in math, equivalent is different from equal. Equal means the same in all aspects, whereas equivalent means similar but not identical. Use the equals sign to express an identity, and equivalence to express, say, the same truth values.
Mathematically, it looks more like:
f′(x) ≡ df(x)/dx ≡ lim_{Δx→0} [f(x + Δx) − f(x)] / Δx
definition 1.2
Note that (the importance of limits) a function is differentiable at x_t if, and only if (IFF), the limit exists at the point x = x_t. For the limit to exist, the definition of the limit requires that the quotient [f(x + Δx) − f(x)] / Δx approaches the same value, f′(x), from both the left and right. That's more limit talk.
EXAMPLE
Let's find f′(x) for f(x) = x².
Define the derivative.
f′(x) = lim_{Δx→0} [f(x + Δx) − f(x)] / Δx
Substitute f(x) = x².
= lim_{Δx→0} [(x + Δx)² − x²] / Δx
Expand the polynomial.
= lim_{Δx→0} [x² + 2xΔx + Δx² − x²] / Δx
x² − x² = 0.
= lim_{Δx→0} [2xΔx + Δx²] / Δx
Divide by Δx.
= lim_{Δx→0} (2x + Δx)
Let Δx → 0 (basically substitution).
= 2x
The trick is that Δx becomes infinitesimally small as it approaches zero, but is always non-zero. Hence, it does not cause the quotient to become undefined.
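To see the limit behave, here's a quick numerical sketch (my own, not from the book): shrink Δx and watch the difference quotient for f(x) = x² settle on 2x.

```python
# Approximate f'(3) for f(x) = x^2 using the difference quotient from
# definition 1.2, with progressively smaller delta_x.
def f(x):
    return x ** 2

def difference_quotient(f, x, delta_x):
    return (f(x + delta_x) - f(x)) / delta_x

for delta_x in (0.1, 0.001, 1e-6):
    print(delta_x, difference_quotient(f, 3.0, delta_x))
# the quotient approaches 2x = 6 as delta_x shrinks
```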
Additionally, for a function to be differentiable at x_t, the function must be continuous at x_t. If the function were not continuous at that point, it would also not have a limit there. However, continuity is a weak criterion. Take f(x) = ∣x∣ for example: the derivative taken at x = 0 from the left is −1, but from the right is +1, so the function is continuous there but not differentiable.
Using definition 1.2 and the laws of limits, we can find derivatives of many fundamental functions. Let n>0 be a natural number and a be a real-valued constant.
d/dx xⁿ = n·xⁿ⁻¹
d/dx e^(ax) = a·e^(ax)
d/dx ln(ax) = 1/x
d/dx sin(ax) = a·cos(ax)
d/dx cos(ax) = −a·sin(ax)
d/dx tan(ax) = a / cos²(ax)
What we have been discussing is just the first derivative of a function. The derivative goes by other names, such as slope or gradient.
Higher Order Derivatives
A derivative of a derivative is a higher order derivative. The results are obtained by applying the definition of a derivative onto the result of itself.
f′′(x) ≡ lim_{Δx→0} [f′(x + Δx) − f′(x)] / Δx
definition 1.3
More generally, for the nth derivative:
f⁽ⁿ⁾(x) ≡ d/dx f⁽ⁿ⁻¹⁾(x) ≡ lim_{Δx→0} [f⁽ⁿ⁻¹⁾(x + Δx) − f⁽ⁿ⁻¹⁾(x)] / Δx
definition 1.4
… whenever the limit exists.
EXAMPLE
Using the definition for the nth derivative, calculate f′′(x) for f(x) = ax² + bx + c.
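The example stops short of a worked answer. As a sketch (my own code, with arbitrary constants), a second difference quotient suggests f′′(x) = 2a everywhere, which matches applying the power rule twice.

```python
# Approximate f''(x) for f(x) = a*x^2 + b*x + c with the central second
# difference quotient (f(x+h) - 2 f(x) + f(x-h)) / h^2.
a, b, c = 3.0, -2.0, 5.0  # arbitrary constants for the check

def f(x):
    return a * x ** 2 + b * x + c

def second_difference(f, x, h=1e-4):
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

print(second_difference(f, 1.7))  # ~ 2a = 6
```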
Imagine the parabola of f(x)=x2. At x=0, the function achieves a local minimum. Graphically, the line tangent to the graph at this point is horizontal, with a slope m=0.
For f(x) = ∣x∣, x = 0 is a local extremum: a critical point, defined to be a place where the derivative is zero or does not exist.
There are three types of stationary points:
A function has a maximum at the stationary point x = a if f′(a) = 0 and f′′(a) < 0.
A function has a minimum at the stationary point x = a if f′(a) = 0 and f′′(a) > 0.
A stationary point at x = a is called a saddle point if f′(a) = 0 and f′′(a) changes sign at the point.
Worth noting that these are only local min-max points, not the function's global min-max.
EXAMPLE
Find the stationary points for f(x)=3x2−6x. Determine if the points are minimum, maximum, or saddle.
The points are found in the first derivative, and described in the second derivative.
We will use the derivative rules from the table above rather than the general limit definition.
f(x) = 3x² − 6x
f′(x) = 6x − 6
f′′(x) = 6
Now, you can solve for f′(x)=0 and see the stationary point is at x=1.
Also, since f′′(x) is positive, the rate of change is increasing at this point. We don't need to check for a saddle because f′′ is a positive constant and never changes sign. Instead, you can envision this as a bowl shape, meaning this point is a minimum.
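A quick check of the conclusion in code (my own sketch): f′ vanishes at x = 1 and the neighbours sit higher, so it's a minimum.

```python
# f(x) = 3x^2 - 6x and its first derivative from the worked example.
def f(x):
    return 3 * x ** 2 - 6 * x

def f_prime(x):
    return 6 * x - 6

print(f_prime(1.0))              # 0.0 -> stationary point at x = 1
print(f(0.9), f(1.0), f(1.1))    # neighbours are higher -> minimum
```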
EXAMPLE
Suppose we have 6 m of framing material and we want to build a rectangular window with sides x and y. Choose the values of x and y that maximize the area, because we want a lot of sunlight.
We know that the area is A = xy, and we have C = 2x + 2y worth of material. We want to maximize the area, so we need to find where its derivative equals 0.
Rearrange the perimeter formula
C = 2x + 2y
6 = 2x + 2y
y = (6 − 2x)/2
y = 3 − x
Pop that into the Area
A=xy=x(3−x)=−x2+3x
Then, go for the derivative
A′(x)=−2x+3
And solve for x where A′(x) = 0
A′(x) = −2x + 3
0 = −2x + 3
2x = 3
x = 3/2
Solve for y=3−3/2=6/2−3/2=3/2.
It shouldn’t be too much of a surprise that the largest area comes from a square.
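And a brute-force sanity check (my own sketch): scanning A(x) = x(3 − x) over a fine grid confirms the peak at x = 3/2.

```python
# Evaluate the area A(x) = x * (3 - x) on a fine grid over [0, 3]
# and pick the x with the largest area.
best_x = max((i / 1000 for i in range(3001)), key=lambda x: x * (3 - x))
print(best_x, best_x * (3 - best_x))  # 1.5 2.25
```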
□
Rules of Differentiation
Differentiation of Functions with a Constant
Let a ∈ ℝ be an arbitrary constant, g(x) be some function, and f(x) = a·g(x).
df(x)/dx = f′(x) = a · dg(x)/dx = a·g′(x)
Differentiation of Products
Suppose our function is the product of multiple functions (eg. f(x)=u(x)×v(x)). Decomposing a function into product parts may ease the strain of applying the definition of a derivative to the entire function. The idea is to choose a u(x) and v(x) that are easy to differentiate. Basically, we are about to derive the product rule of differentiation…
Define a derivative:
f′(x) = lim_{Δx→0} [f(x + Δx) − f(x)] / Δx
Substitution of u(x)×v(x)
= lim_{Δx→0} [u(x + Δx)·v(x + Δx) − u(x)·v(x)] / Δx
Since u(x + Δx)·v(x) − u(x + Δx)·v(x) = 0, we add that in (tricky substitution), factor the terms into two quotients, and let Δx → 0 to arrive at the product rule:
f′(x) = u′(x)·v(x) + u(x)·v′(x)
Differentiation of Composite Functions
It might be hard to imagine right now, but sometimes it is easier to think of functions as functions of functions! Consider f(x) = (x − 1)². If we let u(x) = x − 1, then f(x) = u(x)².
dy/dx = (dy/du)·(du/dx)
definition 1.6
Let’s walk through an example.
define our function
f(x) = (x − 1)²
Substitute
f(u(x)) = u(x)², where u(x) = x − 1
Define derivative
dy/dx = (dy/du)·(du/dx)
What is dy/du?
f(u) = u²
dy/du = 2u
What is du/dx?
u(x) = x − 1
du/dx = 1
And therefore:
dy/dx = 2u · 1 = 2u
Substitute u(x) = x − 1
dy/dx = 2(x − 1)
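A central-difference check of the result (my own sketch): the numeric derivative of f(x) = (x − 1)² matches 2(x − 1).

```python
# Compare a numeric derivative of f(x) = (x - 1)^2 with the chain-rule
# answer 2(x - 1) at a sample point.
def f(x):
    return (x - 1) ** 2

def numeric_derivative(f, x, dx=1e-6):
    return (f(x + dx) - f(x - dx)) / (2 * dx)  # central difference

x = 4.0
print(numeric_derivative(f, x), 2 * (x - 1))  # both ~ 6
```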
Differentiation of Quotients
Similar to the product rule, the quotient rule finds f′(x) for f(x) = u(x)/v(x). Basically, you can consider applying the product rule to f(x) = u(x)·v(x)⁻¹.
f′(x) = (u(x)/v(x))′ = [v(x)·u′(x) − u(x)·v′(x)] / v(x)²
definition 1.7
EXAMPLE
Find f′(x) for
f(x) = (sin(3x) − 1)² / x³
You can do this using the product rule or the quotient rule. I like only remembering one formula, so this is how I would do it… note that differentiating (sin(3x) − 1)² will also require the chain rule, which is the derivative of the outer function times the derivative of the inside.
You can continue to factor this, pulling out a 2(sin(3x) − 1) from each term, but it doesn't make a huge difference.
□
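Since the worked algebra is omitted above, here's a numeric check of the quotient-rule answer (my own sketch), with u(x) = (sin(3x) − 1)² and v(x) = x³; differentiating u uses the chain rule.

```python
import math

def f(x):
    return (math.sin(3 * x) - 1) ** 2 / x ** 3

def f_prime(x):
    # quotient rule: (u' v - u v') / v^2
    u = (math.sin(3 * x) - 1) ** 2
    u_prime = 2 * (math.sin(3 * x) - 1) * 3 * math.cos(3 * x)  # chain rule
    v = x ** 3
    v_prime = 3 * x ** 2
    return (u_prime * v - u * v_prime) / v ** 2

x, dx = 0.7, 1e-6
numeric = (f(x + dx) - f(x - dx)) / (2 * dx)
print(numeric, f_prime(x))  # should agree closely
```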
Integrals of Functions of a Single Variable
Integrals as Area Under the Curve
So far, we have been using the definition of slope m = Δy/Δx, which defines an average rate of change. We then consider progressively shorter x intervals, Δx → 0, to determine the instantaneous rate of change f′(x) = dy/dx.
But given the rate of change, or the function of it, can we determine the coordinate values? For example, if given the function of velocity, can we determine an object's position at a given point in time? If velocity is change of position per unit time, expressed as v(t) = Δs/Δt, then we solve for position by Δs = Δt × v(t). Graphically, this looks like base times height, which thus becomes the area under the curve.
Informally, we can approximate this value by summing sub-intervals between t₀ and tₙ. However, we will technically be summing rectangles, which produces an error for a smooth curve. To increase accuracy, we shorten the intervals, increasing their total number. And, if we let the interval length approach 0, we start our journey into integration.
Below is a function summing the intervals. Let f(x) be a function defined over the interval a ≤ x ≤ b. We divide our interval [a, b] into n sub-intervals such that a = ξ₀ < ξ₁ < ... < ξₙ = b. That means f(xᵢ) × (ξᵢ − ξᵢ₋₁) is like our base times height.
S = Σ_{i=1}^{n} f(xᵢ)·(ξᵢ − ξᵢ₋₁)
definition 1.8
Note that the area under some curves over certain intervals is not finite (e.g. f(x) = 1/x). However, if the limit does exist as n → ∞ (with the sub-interval widths shrinking to 0), we define an integral as…
I = ∫_a^b f(x) dx
definition 1.9
The integral is undefined if the limit does not exist. For closed, finite intervals, the limit exists if the function is continuous over that interval. It is both interesting and convenient to consider integration to be the summation of infinite parts.
The function to be integrated, f(x), is called the integrand. The process of integration kind of involves summing rectangles of area on the interval and letting the width of the intervals approach 0. Certain summations can be expressed as an ordinary function, making the process easier. For example
Σ_{i=1}^{n} i² = n(n+1)(2n+1) / 6
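A quick check of that closed form (my own sketch):

```python
# Compare the brute-force sum of squares with the closed form for n = 100.
n = 100
lhs = sum(i ** 2 for i in range(1, n + 1))
rhs = n * (n + 1) * (2 * n + 1) // 6
print(lhs, rhs)  # 338350 338350
```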
Let’s look at some interesting integral properties or identities:
∫_a^b 0 dx = 0
∫_a^a f(x) dx = 0
∫_a^b [f(x) + g(x)] dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx
∫_a^c f(x) dx = ∫_a^b f(x) dx + ∫_b^c f(x) dx, ∀ b ∈ [a, c]
∫_a^b f(x) dx = −∫_b^a f(x) dx
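The rectangle-summing idea from definition 1.8 is easy to sketch in code (my own example): a left-endpoint Riemann sum for the integral of x² over [0, 1] closes in on 1/3 as n grows.

```python
# Left-endpoint Riemann sum over [a, b] with n equal sub-intervals.
def riemann_sum(f, a, b, n):
    width = (b - a) / n
    return sum(f(a + i * width) * width for i in range(n))

for n in (10, 100, 10000):
    print(n, riemann_sum(lambda x: x ** 2, 0.0, 1.0, n))
# values approach 1/3
```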
Integrals as inverse of differentiation
A formal definition is:
F(x) = ∫_a^x f(u) du
(1.14)
We are going to quickly prove the Fundamental Theorem of Calculus, which states that the derivative of the integral gives back the original integrand.
Consider:
F(x + Δx) = ∫_a^{x+Δx} f(u) du
We split the integral into workable pieces:
F(x + Δx) = ∫_a^x f(u) du + ∫_x^{x+Δx} f(u) du
Then a quick substitution:
F(x + Δx) = F(x) + ∫_x^{x+Δx} f(u) du
Now, divide both sides by Δx.
[F(x + Δx) − F(x)] / Δx = (1/Δx) ∫_x^{x+Δx} f(u) du
Then, we consider the limit as Δx → 0
dF(x)/dx = f(x)
You may also see it written with the definition of F(x) substituted back in as:
d/dx [∫_a^x f(u) du] = f(x)
Typically, the definition is written as:
∫f(x)dx=F(x)+c
Where c is a constant. This is because the derivative of a constant c will be 0 and therefore the value is lost. However, when going backwards, we account for the lost value with a placeholder. We refer to c as the constant of integration.
And an interesting pattern that appears a bit in probability theory is, let x0 be an arbitrary fixed point such that x0∈(a,b). Then
Previous definitions expected the bounds of integration to be finite. However, it is often the case that one or both bounds are infinite. We can extend the definition:
∫_a^∞ f(x) dx = lim_{b→∞} ∫_a^b f(x) dx = lim_{b→∞} [F(b) − F(a)]
Where the limit as b approaches ∞ is evaluated after the integral is calculated. That is, integrate and then evaluate.
Evaluation of integrals
Unlike derivatives, integrals usually cannot be evaluated as easily. As such, we have an extended list of recipes. Note that u is typically a function u(x), and that du is its differential such that du = u′(x) dx. The notation can be a little confusing. Khan Academy has a huge section on integration techniques, including u-substitution.
Page 28… Needs to be wrapped up. Even the book says that large number of integrals can be found in tables of integrals.
To evaluate unknown integrals, we try to transform them into forms that are easier to evaluate. Here’s a quick reference of some techniques:
Logarithmic integration
Decomposition = When the integrand is a linear combination of integrable functions, you can split the integral of the sum into a sum of simpler integrals.
Substitution = Essentially reverses the chain rule for derivatives. Helps to integrate composite functions. Brings back memories. You need to substitute not just the u but also find a du.
Derivative of chain rule: w(u(x)) → w′(u(x))·u′(x)
Think: we are going backwards.
Integration by parts = Similar to substitution reversing the chain rule, integration by parts reverses the product rule.
∫uv′dx=uv−∫vu′dx
We probably need many examples.
Try to evaluate ∫_a^b x·cos(x) dx. Hint: look at the above list… you should skip over all but integration by parts.
Try to evaluate the following:
∫ 1/(x² + x) dx
From page 30. It's a cool situation where you factor the bottom, use partial fraction decomposition, and notice things begin to look a bit logarithmic.
EXAMPLE
Evaluate the following integral
F = ∫_0^1 x² sin(x) dx
The best way to probably handle this is integration by parts.
F(x)=∫u(x)v′(x)dx=u(x)v(x)−∫v(x)u′(x)dx
I say this because in the derivative case it would be a power rule.
At this point the reader may become annoyed realizing they, again, have to perform another integration by parts… We take out the negative from the cosine to make things easier.
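Doing the parts twice gives ∫ x² sin(x) dx = −x² cos(x) + 2x sin(x) + 2 cos(x) + c (my own working), so the definite integral should be cos(1) + 2 sin(1) − 2 ≈ 0.223; here's a numeric check.

```python
import math

# Midpoint Riemann sum as an independent check on the parts answer.
def midpoint_sum(f, a, b, n=100000):
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) * width for i in range(n))

numeric = midpoint_sum(lambda x: x ** 2 * math.sin(x), 0.0, 1.0)
exact = math.cos(1) + 2 * math.sin(1) - 2
print(numeric, exact)  # both ~ 0.2232
```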
You are in a car moving at speed v(t). How far has the vehicle travelled between t0=0s and t1=5s?
v(t) = t / (t² + 1)
The anti-derivative of velocity is distance, so it's a bounded (definite) integration problem. I am going to suggest u-substitution because I don't see integration by parts helping…
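With u = t² + 1 and du = 2t dt, the distance works out to ½ ln(26) ≈ 1.63 (my own working); a brute-force sum agrees:

```python
import math

# Midpoint Riemann sum for the distance integral of v(t) = t / (t^2 + 1).
def midpoint_sum(f, a, b, n=100000):
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) * width for i in range(n))

numeric = midpoint_sum(lambda t: t / (t ** 2 + 1), 0.0, 5.0)
print(numeric, 0.5 * math.log(26))  # both ~ 1.629
```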
Taylor's Theorem, named after Brook Taylor who expressed this relationship in 1712, provides an approximation to a function in the vicinity of a given point x₀ as a sum. The theorem requires that the function f(x) be continuous and that all derivatives up to order f⁽ⁿ⁾(x) exist in order to generate the nth-degree polynomial approximation of f(x) near x₀. You can always refer to Wikipedia for more information. However, per equation 1.21, which is
∫_a^{a+ϵ} f′(x) dx = f(a + ϵ) − f(a)
We say that a and a + ϵ are in each other's vicinity, and rewrite the fecker as:
f(a + ϵ) = f(a) + ∫_a^{a+ϵ} f′(x) dx
Call this equation 1.24
Now comes the magic. We must assume that ϵ is very small. So small that we can assume f′(x)≈f′(a). A side-effect of this assumption is that
f(a+ϵ)≈f(a)+ϵf′(a)
We are kind of saying that f(a+ϵ) is f(a) plus the tiny increment multiplied by the rate of change. After all, that is actually what we kind of assume integration is actually doing under the hood. So, then we express in terms of x and a, assuming we stay close to the point a, to get the approximation
f(x)≈f(a)+(x−a)f′(a)
call this equation 1.26. Sorry the numbering is a bit off.
My example:
Let f(x) = 3x². Compare the actual and approximate values if a = 6 and ϵ = 0.1.
actual
f(6 + 0.1) = 3(6.1)² = 3 × 37.21 = 111.63
approximate
Firstly, we can determine that f′(x) = 6x. Then,
f(6.1) ≈ 3 × 6² + 0.1 × 6(6) = 108 + 3.6 = 111.6
That’s not a bad approximation.
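The same comparison in code (my own sketch):

```python
# Linear approximation f(a + eps) ~ f(a) + eps * f'(a) for f(x) = 3x^2.
def f(x):
    return 3 * x ** 2

def f_prime(x):
    return 6 * x

a, eps = 6.0, 0.1
actual = f(a + eps)                # 111.63
approx = f(a) + eps * f_prime(a)   # 108 + 3.6 = 111.6
print(actual, approx)
```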
This approximation is called the linear approximation of f(x) near x=a. It is a tangent line approximation to a function f. We obtain better approximations with more information about f, which is to say, using higher order derivatives. Because f is n-differentiable, we continue to apply approximations to each derivative:
It is like integrating the ϵ portion, but taking another derivative of the f(a) function. You can continue the process either forever or until no higher order derivatives exist. That will yield the nth-degree Taylor polynomial approximation. A better expression, which I suppose assumes that ϵ = (x − a):
f(x) ≈ f(a) + (x − a)f′(a) + [(x − a)²/2!]f′′(a) + … + [(x − a)ⁿ/n!]f⁽ⁿ⁾(a)
Partial Differentiation
We will now consider rates of change of functions that depend on more than one independent variable. Derivatives of functions of single variables are related to the change, or gradient, of that function. Consider the function z = f(x, y) = x² + y². It has a specific gradient in all directions of the xy plane. It's probably easiest to consider working in 3-D space, as we can imagine it visually, before moving to higher spatial dimensions that range from difficult to impossible to imagine visually.
Wiki has a small section on partial derivatives. I mostly looked it up to see what the symbol ∂ was. I thought it was Greek, but it is just a stylish cursive 'd'. It indicates that differentiation is performed partially with respect to a single variable, keeping the others constant.
The derivative, once a tangent line on our 2-D graph, becomes a tangent plane on a 3-D graph. And you can imagine it being a tangent volume in a 4-D graph, which is why it is harder to visually represent higher dimensions. Circling back to our 3-D logic, we can think of taking a derivative as a multi-step problem. Before, 1 independent variable meant one, sometimes big, step. Now, we determine the rate of change along each axis, holding the other variable(s) constant. Each step is a partial derivative, which is to say, we are only finding part of the rate of change, or gradient, of the function. Let's work up some equations.
∂f/∂x = lim_{Δx→0} [f(x + Δx, y) − f(x, y)] / Δx
equation 1.28
But, don’t forget about the other independent variable…
∂f/∂y = lim_{Δy→0} [f(x, y + Δy) − f(x, y)] / Δy
equation 1.29
Some other notations might be
∂f/∂x = ∂f(x, y)/∂x ≡ f_x = ∂_x f
equations 1.31
They are different ways of writing the same thing, but I might recommend sticking with the first honestly.
You can also calculate higher order derivatives, provided that the relevant limits exist. Let's look at some possibilities, considering we only have 2 independent variables: ∂²f/∂x², ∂²f/∂y², ∂²f/∂x∂y, and ∂²f/∂y∂x.
Under sufficient continuity conditions, the following relationship should hold:
∂²f/∂x∂y = ∂²f/∂y∂x
I would recommend taking the time now to practice.
Find the partial first derivatives of f(x,y)=3x2y2+y. Then, find the second order derivatives from each partial with respect to the other independent variable. Compare them, are they the same?
∂f/∂x = 6xy²
Notice that the standalone y term becomes 0. This is because ∂y/∂x = 0 when y is held constant; hopefully that is written correctly.
∂f/∂y = 6x²y + 1
Hopefully that makes enough sense. It's kind of like just treating the x² as a constant coefficient. Now…
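A finite-difference check of both partials (my own sketch), holding the other variable fixed in each quotient:

```python
# f(x, y) = 3 x^2 y^2 + y and central-difference partial derivatives.
def f(x, y):
    return 3 * x ** 2 * y ** 2 + y

def partial_x(f, x, y, h=1e-6):
    return (f(x + h, y) - f(x - h, y)) / (2 * h)  # y held constant

def partial_y(f, x, y, h=1e-6):
    return (f(x, y + h) - f(x, y - h)) / (2 * h)  # x held constant

x, y = 2.0, 3.0
print(partial_x(f, x, y), 6 * x * y ** 2)      # both ~ 108
print(partial_y(f, x, y), 6 * x ** 2 * y + 1)  # both ~ 73
```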
This LibreTexts.org section looks like a beautiful explanation of partial derivatives.
Total Differential
I got ahead of myself asking for the whole derivative above. Basically, we want to investigate the rate of change if we move in any direction in the domain. That is, a little in the x (or Δx), and a little in the y as well.
Δf(x,y)=f(x+Δx,y+Δy)−f(x,y)
The book then does the algebraic trick of adding and subtracting the same term, f(x, y + Δy) − f(x, y + Δy) = 0. This allows us to factor into desirable quotients. You can also multiply by Δx/Δx = 1 and Δy/Δy = 1. It looks a little funny, but if you let the deltas approach 0, you get:
df = (∂f/∂x) dx + (∂f/∂y) dy
equation 1.32.
And for n independent variables
df = (∂f/∂x₁) dx₁ + (∂f/∂x₂) dx₂ + ... + (∂f/∂xₙ) dxₙ
equation 1.33.
LibreTexts.org has its own section on the Total Differential as well that defines it similarly, but explains it a little better. To better understand, let Δx = dx represent a change in the independent variable x. We can assume that when dx is small, dy ≈ Δy, the change in y resulting from the change in x. The assumption also includes that as dx gets smaller, the difference between Δy and dy goes to 0. That is, as dx → 0, the error in approximating Δy with dy also goes to 0. An interesting distinction between Δy and dy.
If we expand this logic to a function with 2 independent variables, like z = f(x, y), we would like Δx = dx and Δy = dy. Then, the change in z becomes Δz = f(x + dx, y + dy) − f(x, y). And we approximate Δz ≈ dz = f_x dx + f_y dy. This means the total change in z is approximately the change caused by Δx and Δy.
It’s really just an approximation. Wikipedia has an article as well.
Chain Rule
We start with the total derivative from 1.32, and follow a similar approach to single variable derivatives:
df/du = (∂f/∂x)(dx/du) + (∂f/∂y)(dy/du)
You can go further with deeper nesting of functions as well if needed:
df(u(v(x)))/dx = (∂f/∂u)(∂u/∂v)(dv/dx)
That’s it for this section. A few examples would probably be nice.
EXAMPLE
Find total derivative for the following:
f(x,y)=cos(xy)
The total derivative is slightly different from a regular one: it's the sum of the first order partial terms. Here, ∂f/∂x = −y·sin(xy) and ∂f/∂y = −x·sin(xy), so df = −y·sin(xy) dx − x·sin(xy) dy.
Integrals of Functions of Multiple Variables
We will again look at a function f(x, y) with specific bounds in both the x and y directions, represented by a region R, enclosed by a contour C. Following the approach from before, we divide the region into N areas of ΔA_p. And we sum the product of each area times the value of the function at a point inside it.
S = Σ_{p=1}^{N} f(x_p, y_p) ΔA_p
Of course, this is monotonous, yet non-trivial, algebra. Calculus comes in as we let N→∞, which implies that ΔAp→0.
I = ∫_R f(x, y) dA
equation 1.35
where we consider dA to be an infinitesimally small area in the (x,y) plane.
Now, if we choose small rectangles in the x and y directions, we write ΔA = ΔxΔy. As both independent variable deltas tend to zero, we write
I = ∬_R f(x, y) dx dy
equation 1.36.
Yes, that is a double integral. Sometimes, the order of integration matters.
I = ∫_{y=c}^{y=d} [ ∫_{x=x₁(y)}^{x=x₂(y)} f(x, y) dx ] dy
equation 1.37
You can reverse it as well, I’ll leave that to your imagination.
But equation 1.37 needs a little elaboration. The inner integral treats y as a constant while x is being integrated; the outer integral then integrates over y. If possible, try to express the inner bounds in terms of the outer independent variable.
The book covers an example on page 38 where a triangular region from (x, y) = (0, 0) to x + y = 1 is evaluated. This allows the inner bounds to be expressed as a function of the outer independent variable.
I would consider a more difficult example where we are given a range for z where we then calculate the bounds of (x,y) to evaluate the integral over.
Additionally, we easily extend the notation for more independent variables:
∭_V f(x, y, z) dx dy dz
equation 1.39
EXAMPLE
Evaluate the following:
I(x, y) = ∬ (x/y) dx dy
A good point to make is that we are looking for a solution that satisfies this condition:
∂²I(x, y)/∂x∂y = x/y
So, a particular solution can be as follows…
I(x, y) = ∬ (x/y) dx dy = ∫ (x²/2y) dy = (x²/2) ln(∣y∣)
We ignored the constant of integration for a moment. Apparently, to go from the particular to the general solution, you add in the constants… but they aren't normal constants anymore. They can be entire functions, but only of one variable each. That is how they vanish under the mixed partial derivative.
∂²c(x, y)/∂x∂y = 0
c(x, y) = f(x) + g(y)
And therefore, we add in the constants of integration: I(x, y) = (x²/2) ln(∣y∣) + f(x) + g(y).
The book evaluated both x and y, both equating to zero, but that is unnecessary.
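A numeric sanity check on the particular solution (my own sketch): taking the mixed partial of I(x, y) = (x²/2) ln(y) with a four-point stencil should give back x/y.

```python
import math

def I(x, y):
    return x ** 2 * math.log(y) / 2  # particular solution (y > 0)

def mixed_partial(F, x, y, h=1e-4):
    # four-point stencil for d^2 F / (dx dy)
    return (F(x + h, y + h) - F(x + h, y - h)
            - F(x - h, y + h) + F(x - h, y - h)) / (4 * h ** 2)

x, y = 1.5, 2.0
print(mixed_partial(I, x, y), x / y)  # both ~ 0.75
```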
Calculus of Variations
Calculus of Variations is an extremely important extension of the idea of finding local extrema of real-valued functions through the use of stationary points.
Looks like we are moving towards finding extrema, which are maxima and minima values, the idea behind calculus of variations. It is often useful to find a function f(x) that yields extreme values.
The example we are looking at is a rope tied to two points, A and B. With no external forces other than gravity, and the initial motion at rest, the force of gravity acts on each part of the rope, which takes the shape where the total potential energy, expressed as an integral over all small segments of the rope, is minimal. We want to find the function y(x) that describes the shape of the hanging rope with the minimal potential energy.
To introduce the calculus of variations, start with:
I = ∫_a^b F(y, y′, x) dx
equation 1.40
…where a, b, and F are given by the nature of the problem. The limits a and b of the integral are fixed; they correspond to the endpoints of the rope. Functions that take other functions as their input and produce a scalar as their output are called functionals. That is, the argument of a functional is an entire curve. We say that I is a functional of y(x), denoted as:
I=I[y(x)]
equation 1.41
So, we use the square brackets to indicate that I is a functional, instead of a function on ℝⁿ. Think of I[y(x)] as a replacement for our use of f(x), but remember that they are technically different concepts.
A stationary point y(x) of the functional is one where the functional I does not change if y(x) is perturbed by a small amount. We look for the curves y(x) that give stationary values of the integral I, and determine whether such curves are extrema of the integral. It is possible for the integral to have more than one stationary point.
Are stationary points then when the derivative is 0?
Let’s break down the concept of the calculus of variations in simple terms.
Imagine you have a path or a curve represented by the function y(x). This curve could be anything like the path of a ball flying through the air or the shape of a hanging cable. Now, we want to find a special point on that curve that has a unique property.
This special point is called a “stationary point.” A stationary point is a point where something interesting happens. In this case, it’s a point where the curve y(x) doesn’t change much if we make a small change to it.
To understand this, let’s think about a ball rolling down a hill. If you imagine the path the ball takes, it might go up and down, left and right, but it eventually reaches the bottom of the hill. At that bottom point, if you move the ball just a tiny bit to the left or right, it won’t really make a big difference because it’s already at the lowest point. That lowest point is like a stationary point.
Similarly, in the calculus of variations, we’re looking for a curve (represented by the function y(x)) where if we make a small change to it, the overall effect on the curve is very small. It’s like finding the path that is already at its best possible shape.
Now, the functional I[y(x)] is a way to measure how good a curve is. It’s like a scoring system for curves. If a curve gets a higher score, it means it’s better according to certain criteria. The functional takes into account the shape of the curve and how it behaves.
So, when we say that a stationary point y(x) of the functional I is a point where the functional doesn’t change if the curve is perturbed by a small amount, it means that at that point, the curve is already at its best shape according to the scoring system. No matter how you slightly wiggle or change the curve, the score won’t improve or change much because it’s already as good as it can be.
Finding these special points, these stationary points, helps us understand and optimise different processes in the world, like finding the shortest path between two points or the shape that minimises the energy of a system. It’s a powerful tool in mathematics and science to figure out the best solutions for various problems.
- ChatGPT
Ok, my confusion was around the statement saying that the functional doesn't change. Actually, what we mean is that the functional I changes so little that the change is insignificant compared to the overall value of the functional. We represent a perturbation of the curve like:
y(x)→y(x)+ϵη(x)
equation 1.42
By our conversation, we require that I doesn't change (much) if we perturb our stationary path by a tiny amount, say ϵ, using any (sufficiently well-behaved) function η(x). We write something like:
dI/dϵ ∣_{ϵ=0} = 0, ∀ η(x)
equation 1.43
I believe that reads as: the rate of change of I with respect to ϵ, evaluated at ϵ = 0, equals 0 for all tiny perturbations η(x) of our stationary path.
Now, sub that into our previous definition of the functional:
I[y(x), ϵ] = ∫_a^b F(y + ϵη, y′ + ϵη′, x) dx
equation 1.435
I suppose we consider all functions to be well behaved, especially when considering situations related to physical examples. And I assume by well behaved we mean… continuous.
We then, for some reason, throw in an example of Taylor Series with Multiple Variables. It is like single variable, but where you basically perform polynomial expansion on higher order terms. And with little to no introduction to vectors, the book generalises the formula to any number of variables denoted by the vector x. Check out page 41 of the course text…
And then we push on to incorporate the Taylor series with our calculus of variations. We make some substitutions, and eventually get to the Euler-Lagrange equation:
∂F/∂y = d/dx (∂F/∂y′)
equation 1.48
The importance of the Euler-Lagrange equation is that it can be used to find stationary paths of a wide class of functionals in a standardised manner. What types of functionals can we use this theory with? Those that have the specified form
I[y(x)] = ∫_a^b F(y, y′, x) dx with y(a) = y_a and y(b) = y_b
It has bounds, and F may depend on the function y, its derivative y′, and the independent variable x. However, if there is a dependency on the second derivative or higher order derivatives, the theory cannot be applied. Additionally, F doesn't need to depend on y, y′, and x all at once, but it can.
Functional Example
Can we look at an example? How about proving the shortest path between 2 points is a straight line?
Let's start with 2 points A and B, at (x₁, y₁) and (x₂, y₂). But remember that we like things in terms of each other, so it's more like (x₁, f(x₁)), and the same for the other. The wonderful Pythagoras' Theorem states that c² = a² + b².
For small / tiny segments of the path we will measure for the shortest distance, we can approximate the length with the distance formula ds = √((dx)² + (dy)²), assuming that dx and dy are small enough to justify a useful approximation. We then let dy = f′(x) dx and factor out the dx.
ds = √(1 + (y′)²) dx
Now, the total length of the line can be expressed as the sum of all of the tiny bits, which means as an integral:
L = ∫_a^b ds = ∫_a^b √(1 + (y′)²) dx
eq. 1.50
Remember, we kind of don't know what y(x) is yet; it's just a placeholder at this point. However, we want to calculate the path that leads to a stationary point for L. In this case, a minimum distance between A and B.
Let I[y(x)] be a functional that maps functions satisfying y(a)=ya and y(b)=yb to the real numbers. Any such function for which,
dI[y(x) + εη(x)]/dε ∣_{ε=0} = 0
…for all η(x) with η(a) = η(b) = 0, is said to be a stationary path of I. This is kind of saying that when there is no perturbation (ε = 0), we are looking for a local extremum, where a derivative is 0. Because the argument is a curve, I suppose that is why we call it a path instead of a point.
Let's use the Euler-Lagrange equation. Note, the function in the integral L does not explicitly depend on y. This conveniently implies that ∂F/∂y = 0. Write the equation!
d/dx (∂F/∂y′) = 0
So, for the derivative of something to equal zero, it must be a constant value! That's interesting to know. Anyway, looks like we do a little bit of literal substitution here:
c = ∂F/∂y′ = y′ / √(1 + (y′)²)
That is the derivative of the function F = √(1 + (y′)²) with respect to y′. Then, go through the painful algebraic steps of getting to
dy = (c / √(1 − c²)) dx
A simple integration yields an equation similar to y = mx + b, but m is that mess with constant c.
Wikipedia has an article on Calculus of Variations also. There's an article on the fundamental lemma of the calculus of variations, which also sounds important.
Another example of how a functional can occur in practice is when trying to determine the length of a curve. A classical problem in the calculus of variation is to find the surface of minimal area that is generated by revolving a curve y(x) about the x-axis, where y(x) passes through two given points (a,ya) and (b,yb).
Another Example (From Lecture)
Suppose we have a functional S[y(x)] = ∫_0^1 [(y′(x))² + y] dx. We are considering functions with endpoints y(0) = 0 and y(1) = 2.
Can we find the stationary path of this functional? Using the Euler-Lagrange equation:
∂F/∂y = d/dx (∂F/∂y′), where F(y, y′, x) = (y′)² + y
So, our function F actually does not have an explicit x argument. Also notice how F is the integrand of our functional. Let's explicitly work through our E-L equation (keep reading if the solution is confusing):
∂F/∂y = 1
We are taking the partial derivative of F with respect to y, actually solving for that bit. For this purpose, we view y′ as a constant. Derivatives of constants are equal to ZERO. And ∂y/∂y = 1.
Now we solve for the next bit on the right side of the equation:
∂F/∂y′ = 2y′
Hopefully from above the reader can understand how we came to this solution.
Now, for the most interesting step, we must compute the total derivative of the expression we have determined.
d/dx (∂F/∂y′) = d/dx (2y′) = 2y′′
The left side of the equation is read as "the total derivative with respect to x of ∂F by ∂y′." When it comes to the total derivative, we must consider both explicit and implicit x dependencies. You can see that y′ doesn't have an explicit x in it, so there's no explicit x dependency. However, y′ is in fact a function of x. Guilty by association, because y(x) is a function of x. Because of this, we end up getting the second order derivative.
To Conclude…
1 = 2y′′
y′′ = 1/2
Wow, so now to get the stationary path, we backwards solve from y′′ to y. You might notice that this is a differential equation, but a simple one so don’t panic. Just integrate twice.
y′ = (1/2)x + b
y = (1/4)x² + bx + c
Notice how those integration constants begin to pile up? No worries, because we know the boundary conditions, we can solve for them.
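Finishing the arithmetic in code (my own sketch): y(x) = x²/4 + bx + c with y(0) = 0 and y(1) = 2 forces c = 0 and b = 7/4.

```python
# Solve for the integration constants from the boundary conditions.
c = 0.0               # y(0) = 0  ->  c = 0
b = 2.0 - 0.25 - c    # y(1) = 1/4 + b + c = 2  ->  b = 7/4

def y(x):
    return x ** 2 / 4 + b * x + c

print(b, y(0.0), y(1.0))  # 1.75 0.0 2.0
```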
Very good! I think that helps a little bit with application of theory.
Summary
We looked at single variable and multivariate functions, analysing rates of change with derivatives and integrals, in a slew of poorly explained proofs and a severe lack of examples. We also looked at Taylor expansion and the Euler-Lagrange equation.
We covered finding extrema of a single variable and of a functional. We did not look into finding extrema of multiple variable functions…
Check Yourself
Q: Is stating that f′(a)=0 and that f(x) has a stationary point at x=a the same as saying that f(x) has a local maximum or minimum at the point x=a? (Think hard).
A: No, these are not the same. Although both statements mean that the tangent line is horizontal at x = a, a stationary point still allows for the possibility of a saddle point, whereas a local maximum or minimum excludes it.
Q: Is the antiderivative of a function unique?
A: No
Q: What is a double integral?
A: Apparently, a double integral over a region R can be approximated by filling R with small rectangles Rᵢ and summing the volumes of the rectangular prisms with base Rᵢ and height bounded by the graph of f above Rᵢ.