Linear versus Cyclic Permutations

One aspect of probability I’ve always found to be a little tricky is the part where you need to count things. In theory, this sounds easy enough. After all, it’s just looking at the complete list of things you’re studying, and enumerating them, right?

Well, we know that things become much more subtle when you have a big number to count and you have to be careful when using the “tricks” of multiplication in order to avoid duplication. This is the part that has sometimes seemed straightforward, while at other times being totally mystifying.

If you’ve taken an introductory probability class, you’ve undoubtedly come across the notion of permutations. This concept simply deals with answering the question, “How many ways can I order these items I have?” Just to briefly review, let’s look at an example. Suppose I want to know how many different ways I can order four items: a, b, c, and d. The answer is given by $4!$, which can be seen below.

This makes sense, and it’s usually the image we have in mind when thinking about permutations. This readily generalizes to $n$ items, where the number of permutations is $n!$.
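If you’d like to see the list rather than take $4!$ on faith, here is a minimal Python sketch that enumerates the orderings with the standard library’s `itertools.permutations`. The variable names are my own choices for illustration.

```python
from itertools import permutations
from math import factorial

items = ["a", "b", "c", "d"]

# Every linear ordering of the four items, joined into short strings.
orderings = ["".join(p) for p in permutations(items)]

print(len(orderings), factorial(len(items)))  # both counts should agree
print(orderings[:6])                          # the orderings that start with "a"
```

The same two lines work for any list of distinct items, although the list grows very quickly with $n$.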

But here’s a question. What are the possible orderings of the four items from above if I arrange them in a circle like this?

Suddenly, the idea has shifted. Now, the absolute placement of the object doesn’t matter (e.g., is it first, second, third, or fourth?), but its placement relative to the other objects does. If we look at object $a$, the only thing that matters is that it is in between objects $d$ and $b$. How the circle is oriented doesn’t matter. As such, these scenarios are now equivalent.

The question now becomes, “How do we address this change in possibilities within our expression?”

If we consider the circles I drew above, you may notice that there are four of them. This is not a coincidence. The reason is that if we have a certain configuration of the circle with $n$ objects, we can rotate the circle into $n$ different orientations without changing the relative positions of the objects. This is because the circle possesses rotational symmetry. As such, the number of ways we can permute $n$ objects in a circle is the number of ways we can permute them when they are in a line, divided by $n$. Mathematically, this looks like:

$$P_{cyclic} = \frac{n!}{n} = (n-1)!$$

Here, I’ve used the symbol $P_{cyclic}$ to represent the permutations that can be made in a cycle. For our four objects, this gives $3! = 6$ distinct circles.
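The divide-by-$n$ argument can also be checked by brute force. The sketch below is one possible way to do it, not the only one: it treats two orderings as the same circle if one is a rotation of the other, by reducing each ordering to a canonical rotation. The helper names `canonical_rotation` and `count_cyclic` are hypothetical.

```python
from itertools import permutations
from math import factorial

def canonical_rotation(arrangement):
    """Return the rotation of `arrangement` that sorts first."""
    n = len(arrangement)
    return min(arrangement[i:] + arrangement[:i] for i in range(n))

def count_cyclic(items):
    """Count arrangements of distinct `items` around a circle, up to rotation."""
    return len({canonical_rotation(p) for p in permutations(items)})

# Compare the brute-force count with n!/n = (n-1)! for small n.
for n in range(1, 7):
    assert count_cyclic(tuple(range(n))) == factorial(n) // n

print(count_cyclic(("a", "b", "c", "d")))
```

Note that this only identifies rotations. If flipping the circle over also counted as “the same”, the count would be smaller again.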

This wasn’t meant to be a long post, but it’s something that I thought was interesting, since (without thinking about it) we usually only consider the linear case when looking at permutations. I wanted to show here that circular permutations are just as easy as regular permutations.

Period of a Pendulum

The pendulum is a classic physical object that is modeled in many introductory physics courses. It’s a nice system to study because it is so simple, yet still allows students to see how to study the motion of a system. Here, I want to go through the steps of deriving what is usually seen as an elementary result, the period of a pendulum, and show how it is actually more complicated than what most students see.

To begin with, what exactly is a pendulum? This may seem like an easy question, but it’s a good idea to have a well-defined system. So, the pendulum we will be looking at today is called a simple pendulum. Surprising no one, a simple pendulum is the most idealized pendulum, consisting of a point mass attached by a rod of fixed length. This means we aren’t dealing with a pendulum that has a flexible rope that changes length, nor do we have something like a boat, which doesn’t quite act like a point mass since the mass is distributed throughout the object and isn’t localized. In other words, our situation should look something like this:

You may be wondering why we aren’t using Cartesian coordinates, and the reason is quite simple. In Cartesian coordinates, we would need to specify both the $x$ and $y$ coordinates, which requires two degrees of freedom and is also a pain in this particular setup. By contrast, using polar coordinates is more compact since the radius $r$ is fixed (in this case, $r=l$), which means we only have one degree of freedom, the angle $\theta$.

To begin our analysis, we will start with our generic equation for conservation of energy, which looks like this:

$$E = T + U$$

Here, the kinetic energy is $T$ and the potential energy is $U$. To know the kinetic energy, we need to know the magnitude of the velocity of the object, which we don’t know at the moment (and which changes depending on the angle $\theta$). We do know though that the kinetic energy is given by $T = \frac{1}{2} m v^2$, where $v$ is the magnitude of the velocity (the speed), so we will keep that to the side.

We also know that the potential energy is given by the familiar equation $U = mgh$ on Earth, where $h$ is the height of the object from the ground. To find this height $h$, we need to draw some clever lines and invoke some geometry:

From the diagram above, we can see that the height is given by $h = l \left( 1 - \cos \theta \right)$. Therefore, the potential energy is:

$$U = mgl \left( 1 - \cos \theta \right)$$

With this, we almost have everything we need in our equation. The goal is to isolate for our speed $v$, so we can then integrate it over a whole cycle to find the period. To do this, let’s remember our conservation of energy equation: $E = T + U$. This equation states that the total energy $E$ is always a constant in time. In other words, $\frac{dE}{dt} = 0$, and so we can simply find the total energy at one particular instant, and then substitute it for $E$.

What we will do is consider the energy that the pendulum initially has, just before it is allowed to fall. At that moment, it has an initial angle which we will call $\theta_0$, and since it isn’t moving, the pendulum has no kinetic energy. Therefore, the energy of the pendulum is:

$$E = mgl \left( 1 - \cos \theta_0 \right)$$

We can now make this equal to the sum of the kinetic and potential energy at any time to get:

$$mgl \left( 1 - \cos \theta_0 \right) = \frac{1}{2} m v^2 + mgl \left( 1 - \cos \theta \right)$$

Since each term in this equation has the mass $m$ in it, we can see that our result will be independent of the mass. If we then isolate for $v^2$, we get:

$$v^2 = 2gl \left( \cos \theta - \cos \theta_0 \right)$$

At this point, we need to think about what the speed $v$ is. The definition of speed is $v = \frac{ds}{dt}$, where $s$ is the path length. Fortunately, the path length of a pendulum is very easy to find, since it’s simply the arc length of a circle!

From the above diagram, we can see that the path length is given by $s = l \theta$. Therefore, since $l$ is fixed, the speed is:

$$v = \frac{ds}{dt} = l \frac{d\theta}{dt}$$

We can now substitute this into our expression for $v^2$, and solve for $\frac{d\theta}{dt}$:

$$\frac{d\theta}{dt} = \sqrt{\frac{2g}{l}} \sqrt{\cos \theta - \cos \theta_0}$$

Note here that I’m only considering the positive solution for $\frac{d\theta}{dt}$, since we will be solving for the period, which is a positive quantity. What we will now do is employ the method of separation of variables to integrate this quantity. If you aren’t familiar with this method, I suggest taking a look at a resource on differential equations. Separating our variables gives us:

$$dt = \sqrt{\frac{l}{2g}} \frac{d\theta}{\sqrt{\cos \theta - \cos \theta_0}}$$

This is good. We now have an expression for $dt$, which means we can integrate it for the angle between $0$ and $\theta_0$, and this will be one quarter of the period. To see why it’s only a quarter of the period, look at the following sketch (each arrow is a quarter period):

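Integrating gives us:

$$\frac{T}{4} = \sqrt{\frac{l}{2g}} \int_0^{\theta_0} \frac{d\theta}{\sqrt{\cos \theta - \cos \theta_0}}$$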

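And solving for the period $T$ (not to be confused with the kinetic energy from earlier) gives:

$$T = 4 \sqrt{\frac{l}{2g}} \int_0^{\theta_0} \frac{d\theta}{\sqrt{\cos \theta - \cos \theta_0}}$$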

This is the full expression for the period of a pendulum at any initial angle $\theta_0$. The only slight issue is that, while correct, this expression is not easily integrated. In fact, I don’t know how to integrate it at all. What we would like is a period of the form:

$$T = 2\pi \sqrt{\frac{l}{g}} \left[ 1 + \dots \right]$$

The expression above would be what is called a Taylor expansion, with the first term being what you might have already seen to be the period of a pendulum, plus some correction factors that are contained in the ellipsis. To get it into this form, we want to be able to use the binomial expansion, which is given by:

$$\left( 1 + x \right)^p = 1 + px + \frac{p(p-1)}{2!} x^2 + \frac{p(p-1)(p-2)}{3!} x^3 + \dots$$

To do this, we need to transform our integral expression for the period. First, we will perform what may seem like a totally random substitution, but bear with me. We will change coordinates and go from $\theta \rightarrow \psi$. This mapping will be done using the following relation:

$$\sin \psi = \frac{\sin \left( \frac{\theta}{2} \right)}{\sin \left( \frac{\theta_0}{2} \right)}$$

Looking at this relation, we can see that when $\theta$ ranges from 0 to $\theta_0$, the corresponding variable $\psi$ varies from $0$ to $\pi / 2$.
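To convince yourself of these limits, here is a small Python sketch. It assumes the substitution has the standard form $\sin \psi = \sin(\theta/2) / \sin(\theta_0/2)$, and the helper name `psi_from_theta` and the choice of $\theta_0 = 60°$ are mine, purely for illustration.

```python
import math

def psi_from_theta(theta, theta0):
    """Assumed substitution: sin(psi) = sin(theta/2) / sin(theta0/2)."""
    ratio = math.sin(theta / 2) / math.sin(theta0 / 2)
    # Clamp to 1 in case rounding pushes the ratio slightly above it.
    return math.asin(min(1.0, ratio))

theta0 = math.radians(60)  # illustrative release angle
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    theta = fraction * theta0
    psi = psi_from_theta(theta, theta0)
    print(f"theta = {math.degrees(theta):5.1f} deg -> psi = {math.degrees(psi):5.1f} deg")
```

The printed values climb steadily from $0$ to $90°$, which is the range of integration we will use from here on.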

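Implicitly differentiating each side gives us:

$$\cos \psi \, d\psi = \frac{\cos \left( \frac{\theta}{2} \right)}{2 \sin \left( \frac{\theta_0}{2} \right)} \, d\theta$$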

We can then pull out a handy trigonometric identity called the double angle identity, which is given by:

$$\cos \theta = 1 - 2 \sin^2 \left( \frac{\theta}{2} \right)$$

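Using this identity, we can rewrite the expression inside the square root of our integral for the period as:

$$\cos \theta - \cos \theta_0 = 2 \sin^2 \left( \frac{\theta_0}{2} \right) - 2 \sin^2 \left( \frac{\theta}{2} \right)$$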

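From here, we can insert our original substitution for $\sin \psi$ into the second term above, giving us:

$$\cos \theta - \cos \theta_0 = 2 \sin^2 \left( \frac{\theta_0}{2} \right) - 2 \sin^2 \left( \frac{\theta_0}{2} \right) \sin^2 \psi$$

$$= 2 \sin^2 \left( \frac{\theta_0}{2} \right) \left( 1 - \sin^2 \psi \right)$$

$$= 2 \sin^2 \left( \frac{\theta_0}{2} \right) \cos^2 \psi$$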

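Just to note, from the second to third line, I simply used the Pythagorean theorem. Now, since we wanted the square root of $\cos \theta - \cos \theta_0$, we can take the square root of the above expression. Furthermore, we can use the differentiated form of our substitution in order to find an expression for $d \theta$:

$$\sqrt{\cos \theta - \cos \theta_0} = \sqrt{2} \sin \left( \frac{\theta_0}{2} \right) \cos \psi, \qquad d\theta = \frac{2 \sin \left( \frac{\theta_0}{2} \right) \cos \psi}{\cos \left( \frac{\theta}{2} \right)} \, d\psi$$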

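From this, we can insert everything into the integral for the period and simplify. Note here that I’ve omitted the prefactor in the front of the integral just to get things a little cleaner, but we won’t forget about it.

$$\int_0^{\theta_0} \frac{d\theta}{\sqrt{\cos \theta - \cos \theta_0}} = \int_0^{\pi/2} \frac{2 \sin \left( \frac{\theta_0}{2} \right) \cos \psi}{\cos \left( \frac{\theta}{2} \right)} \cdot \frac{d\psi}{\sqrt{2} \sin \left( \frac{\theta_0}{2} \right) \cos \psi} = \sqrt{2} \int_0^{\pi/2} \frac{d\psi}{\cos \left( \frac{\theta}{2} \right)}$$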

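We’re almost there. Now, we can simply use a rearranged version of the Pythagorean theorem to write:

$$\cos \left( \frac{\theta}{2} \right) = \sqrt{1 - \sin^2 \left( \frac{\theta}{2} \right)} = \sqrt{1 - \sin^2 \left( \frac{\theta_0}{2} \right) \sin^2 \psi}$$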

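Here, I’ve made use of our substitution again in order to write this expression in terms of $\psi$. Throwing this all together and reintroducing the prefactor in front for the period gives us the following result for the period:

$$T = 4 \sqrt{\frac{l}{g}} \int_0^{\pi/2} \frac{d\psi}{\sqrt{1 - \sin^2 \left( \frac{\theta_0}{2} \right) \sin^2 \psi}}$$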

I don’t know about you, but that was a lot of work. This integral is actually a special kind of integral. It’s called a complete elliptic integral of the first kind, and is defined by:

$$K(m) = \int_0^{\pi/2} \frac{d\psi}{\sqrt{1 - m \sin^2 \psi}}$$

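In our case, $m = \sin^2 \left( \frac{\theta_0}{2} \right)$. What’s nice about this form of the integral is that it is indeed in binomial form, so we can expand it. We therefore have:

$$\left( 1 - m \sin^2 \psi \right)^{-1/2} = 1 + \frac{1}{2} m \sin^2 \psi + \frac{3}{8} m^2 \sin^4 \psi + \frac{5}{16} m^3 \sin^6 \psi + \dots$$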

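This looks like quite the jumbled expression, but we can write it quite succinctly in the following form:

$$\left( 1 - m \sin^2 \psi \right)^{-1/2} = 1 + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!} m^n \sin^{2n} \psi$$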

Here, the double factorial sign (!!) means that we skip a number each time we do the multiplication. Therefore, $5!! = 5 \cdot 3 \cdot 1$ and $6!! = 6 \cdot 4 \cdot 2$. You can verify that this does represent the longer expansion written out above. We are now in a better position to evaluate the integral. It looks like this:

$$K(m) = \frac{\pi}{2} + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!} m^n \int_0^{\pi/2} \sin^{2n} \psi \, d\psi$$

This last integral is a bit of a tricky one, but we will show that the integral is given by:

$$\int_0^{\pi/2} \sin^{2n} \psi \, d\psi = \frac{(2n-1)!!}{(2n)!!} \frac{\pi}{2}$$

To get this result, we will use recursion. First, we note that the values of $n$ are all positive, which is clear from the sum above. This means our lowest value of $n$ will be one. If we label the integral we are after as $I(n)$, then we can evaluate this function at $n = 1$ to get:

$$I(1) = \int_0^{\pi/2} \sin^2 \psi \, d\psi = \frac{\pi}{4}$$

With the base case out of the way, we now tackle the whole integral. Let’s start by splitting up the integral as such:

$$I(n) = \int_0^{\pi/2} \sin^{2n-1} \psi \, \sin \psi \, d\psi$$

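We can now use integration by parts to partially evaluate this integral. If we use $u = \sin^{2n-1} \psi$ and $dv = \sin \psi \, d\psi$ (so that $du = (2n-1) \sin^{2n-2} \psi \cos \psi \, d\psi$ and $v = -\cos \psi$), we get:

$$I(n) = \left[ -\sin^{2n-1} \psi \cos \psi \right]_0^{\pi/2} + (2n-1) \int_0^{\pi/2} \sin^{2n-2} \psi \cos^2 \psi \, d\psi$$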

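The first term evaluates to zero, and so we are only left with the integral. We can then change the cosine into a sine and rearrange things to give:

$$I(n) = (2n-1) \left[ \int_0^{\pi/2} \sin^{2(n-1)} \psi \, d\psi - \int_0^{\pi/2} \sin^{2n} \psi \, d\psi \right]$$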

If you look at this and compare it to our definition of $I(n)$, you’ll notice that we can write the above equation as:

$$I(n) = (2n-1) \left[ I(n-1) - I(n) \right]$$

Solving for $I(n)$ gives:

$$I(n) = \frac{2n-1}{2n} I(n-1)$$

This is a recurrence relation, which means it tells us how to construct the next term from the previous one, as long as we have a beginning “seed”. Thankfully, we do have one, which is $I(1) = \pi/4$.
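If you want some reassurance before we unroll the recurrence by hand, here is a minimal Python sketch that compares it with a direct numerical estimate of the integral. It assumes the recurrence has the form $I(n) = \frac{2n-1}{2n} I(n-1)$ with the seed $I(1) = \pi/4$. The function names and the choice of a midpoint rule are mine.

```python
import math

def integral_numeric(n, steps=2_000):
    """Midpoint-rule estimate of the integral of sin^(2n)(psi) over [0, pi/2]."""
    h = (math.pi / 2) / steps
    return h * sum(math.sin((k + 0.5) * h) ** (2 * n) for k in range(steps))

def integral_recursive(n):
    """I(n) built from the seed I(1) = pi/4 via I(n) = (2n-1)/(2n) * I(n-1)."""
    value = math.pi / 4
    for k in range(2, n + 1):
        value *= (2 * k - 1) / (2 * k)
    return value

for n in range(1, 6):
    print(n, integral_recursive(n), integral_numeric(n))
```

The two columns should agree to many decimal places, which is a decent sanity check that no factor went missing in the integration by parts.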

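What we want to do at this point is to keep on applying the recurrence relation to the term $I(n-1)$, until we get all the way to $I(1)$, where we stop. I’ll illustrate a few of these for you, and hopefully it becomes clear what the pattern is.

$$I(n) = \frac{2n-1}{2n} I(n-1) = \frac{2n-1}{2n} \cdot \frac{2n-3}{2n-2} I(n-2) = \frac{2n-1}{2n} \cdot \frac{2n-3}{2n-2} \cdot \frac{2n-5}{2n-4} I(n-3)$$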

I could continue, but this is a good representation of what happens. In summary, the numerators of the fractions are odd numbers (since they are in the form $2k+1$), and the denominators are even numbers (since they are in the form $2k$). Furthermore, as you go down the fraction, you go from an odd number to the next closest odd number, and the argument is the same for the even numbers. Therefore, what we are really doing is another factorial all the way until we get to $I(1)$, which we can evaluate since it is our starting seed. Since $I(1) = \frac{\pi}{4} = \frac{1}{2} \cdot \frac{\pi}{2}$, we get:

$$I(n) = \frac{(2n-1)(2n-3) \cdots 3}{(2n)(2n-2) \cdots 4} I(1) = \frac{(2n-1)!!}{(2n)!!} \frac{\pi}{2}$$

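Now that we have this result, we can put it all together to give us:

$$K(m) = \frac{\pi}{2} \left[ 1 + \sum_{n=1}^{\infty} \left( \frac{(2n-1)!!}{(2n)!!} \right)^2 m^n \right]$$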

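Expanding this gives us the following infinite series:

$$K(m) = \frac{\pi}{2} \left[ 1 + \frac{1}{4} m + \frac{9}{64} m^2 + \frac{25}{256} m^3 + \dots \right]$$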

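If we recall that $m = \sin^2\left( \frac{\theta_0}{2} \right)$ and we insert the prefactor $4 \sqrt{\frac{l}{g}}$ from our integral expression for the period, we get the following result for the period of the pendulum:

$$T = 2\pi \sqrt{\frac{l}{g}} \left[ 1 + \frac{1}{4} \sin^2 \left( \frac{\theta_0}{2} \right) + \frac{9}{64} \sin^4 \left( \frac{\theta_0}{2} \right) + \frac{25}{256} \sin^6 \left( \frac{\theta_0}{2} \right) + \dots \right]$$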

This is the full expression for the period of the pendulum with any starting angle $\theta_0$. What’s quite nice about this expression is that we can immediately see that if $\theta_0 \approx 0$, then all of the sine functions become very close to zero and so the only important term in the square brackets is the one. At this point, the period becomes what one usually learns (for small angles): $T = 2\pi \sqrt{\frac{l}{g}}$.

Furthermore, we can see that when our initial angle gets bigger, it becomes more important to take into account successive terms within the brackets of our final expression.
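To get a feel for how much the corrections matter, here is a rough Python sketch that compares a direct numerical evaluation of the elliptic integral with the first few terms of the series. It assumes the period takes the form $T = 4\sqrt{l/g}\,K(m)$ with $m = \sin^2(\theta_0/2)$. The function names, the midpoint rule, and the values $l = 1$ m and $g = 9.81$ m/s² are my own illustrative choices.

```python
import math

def period_integral(theta0, length=1.0, g=9.81, steps=2_000):
    """Period from T = 4*sqrt(l/g)*K(m), with K(m) estimated by the midpoint rule."""
    m = math.sin(theta0 / 2) ** 2
    h = (math.pi / 2) / steps
    k_of_m = h * sum(
        1.0 / math.sqrt(1.0 - m * math.sin((i + 0.5) * h) ** 2)
        for i in range(steps)
    )
    return 4.0 * math.sqrt(length / g) * k_of_m

def period_series(theta0, length=1.0, g=9.81, terms=4):
    """Period from the first `terms` terms of the series in m = sin^2(theta0/2)."""
    m = math.sin(theta0 / 2) ** 2
    total, coeff = 0.0, 1.0  # coeff holds (2n-1)!!/(2n)!!, starting from n = 0
    for n in range(terms):
        total += coeff**2 * m**n
        coeff *= (2 * n + 1) / (2 * n + 2)
    return 2.0 * math.pi * math.sqrt(length / g) * total

small_angle = 2.0 * math.pi * math.sqrt(1.0 / 9.81)
for degrees in (5, 30, 60, 90):
    theta0 = math.radians(degrees)
    print(degrees,
          period_integral(theta0) / small_angle,
          period_series(theta0) / small_angle)
```

Each row prints the ratio of the period to the small-angle value. The ratios stay very close to one for small release angles and drift upward as $\theta_0$ grows, with the truncated series lagging slightly behind the integral at large angles.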

Hopefully, this wasn’t too bad. I wanted to go through the calculation as explicitly as possible, since I remember being a bit confused when I saw it for the first time. As such, I want to make sure things are illustrated nice and slow so everyone can follow.

What I love the most about these long analytical expressions is how you can recover the simpler result you had from simplifying the problem. We can easily see that our “usual” period is nestled within the long infinite expression. Lastly, I just wanted to make clear that one assumption we did make was that we were dealing with a point mass pendulum. In other words, we still weren’t quite modelling a physical pendulum, which requires taking into account the centre of mass of the bob and the rod of the pendulum together. Still, this is enough precision for today, so we will leave it at that.

On Uncertainty in Science

I’ll let you in on a bit of a secret. For most of my life, I hated doing experiments in science.

It didn’t really matter if the experiments were in physics, chemistry, or biology class (though I enjoyed the fact that physics experiments tended not to be as messy). In fact, when I was in secondary school, my grade was asked at the end of the year to vote on what kind of science class they wanted the next year. There were two choices. One was to keep the material more theoretical and from the textbook. The second was to introduce the content in a much more “hands-on” sort of way, which meant more laboratory experiments. If I recall correctly, I was one of the only students who chose the first option.

I didn’t really understand why everyone wanted to do the hands-on program. In my eyes, it just made things seem less exact and more messy. Other students seemed to like the idea that they could do experiments, but it wasn’t my idea of a fun time.

Moving into CÉGEP, I kept this attitude of not enjoying lab experiments. They were annoying to do, and completing the lab reports afterwards was the worst. One had to deal with uncertainties and significant figures and sources of error that made everything seem much more messy than the theoretical predictions that were made using mathematics. I longed for simple relations without error bars.

From reading the above, it may seem like I think science should be all theoretical. Of course, this is not the case, and I think, if anything, we need to talk more about the uncertainty and messiness in science. If we want to have a society that understands the way we get results in science, we need to communicate this uncertainty more clearly.

Science is not mathematics. Sure, we want to describe the world using mathematics as our language, but we need to keep in mind that nature will not bend to our will. There will always be fluctuations, imprecise measurements, and sheer randomness in some data. We use mathematics to make these uncertainties as small as possible, but we can never fully eliminate them. As such, it’s crucial to realize that a measurement means nothing without its corresponding uncertainty. The reason is simple: we take measurements in order to compare them. If we just dealt with measurements as precise quantities that have no uncertainty, then we would find a lot less agreement with our predictions. This would make it near impossible to do science.

Let’s take a very simple example. Imagine we wanted to measure an object that is said to be 4.500 metres long. To verify this claim, we take a metre stick that has graduations every centimetre and measure the object. Say it comes out to 4.52 metres. Do we say that these two measurements are different?

The answer is, it depends. To find out for sure, we need to know the uncertainties that are associated with each measurement. When the object was measured to be 4.500 metres long originally, what were the uncertainties on that measurement? Was it $\pm 1$ mm? These are critical questions to ask when making comparisons.

If we imagine that the original measurement was indeed good to $\pm 1$ mm and that the metre stick has an uncertainty of $\pm 1$ cm (because this metre stick is only marked off in centimetres), then the two values we are comparing are:

$$4.500 \pm 0.001 \text{ m} \quad \text{and} \quad 4.52 \pm 0.01 \text{ m}$$

The question now becomes: do these two measurements overlap? This is the key question, and in our case, the measurements don’t overlap, since the first measurement could be at most 4.501 m and the second measurement could be at least 4.51 m. Since these two measurements don’t overlap, we consider them to not be in agreement.
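The overlap test is simple enough to write down in a couple of lines. Here is a minimal Python sketch of the criterion used above, which is a plain interval comparison rather than a full statistical treatment. The function name `intervals_overlap` and the second, looser example are my own additions for illustration.

```python
def intervals_overlap(value_a, unc_a, value_b, unc_b):
    """True if [a - ua, a + ua] and [b - ub, b + ub] share at least one point."""
    return abs(value_a - value_b) <= unc_a + unc_b

# The comparison from the text: 4.500 m (+/- 1 mm) against 4.52 m (+/- 1 cm).
print(intervals_overlap(4.500, 0.001, 4.52, 0.01))

# A hypothetical case where the first measurement is only good to +/- 1.5 cm.
print(intervals_overlap(4.500, 0.015, 4.52, 0.01))
```

The first comparison reports no overlap, while the second, with the larger uncertainty, does. Same central values, different conclusion, which is exactly why the uncertainty has to travel with the number.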

As you may notice, this isn’t a trivial matter. It may have seemed like the two measurements did agree at first glance, but without knowing their associated uncertainties, we have no idea. This means that if someone tells you some figure that came from experiment and wasn’t just a theoretical calculation, you need to know their uncertainty if you want to compare the figure to anything else. Without it, the measurement is meaningless.

What I want to stress here is that uncertainty is inherent in science. There’s no getting around this fact, no matter how precise and careful your experiment is. This is why I find it so amusing when people attack scientific results on the basis that they are simply uncertain. Of course they are! This isn’t mathematics, where results have infinite precision. In science, we have this inherent uncertainty, but we use the tools of mathematics to make sure that the uncertainty is as small as possible, and we make our claims using this uncertainty. We make do with what nature presents us.

If there’s one thing I want to ask of you, it is this: make sure you’re aware of the inherent uncertainty in science, so that you aren’t worried when you see scientists saying that the measurements agree with theory, despite the seeming non-equivalence. Chances are, the uncertainty in the measurement is what allows scientists to make this claim. Conversely, look for those who try to exploit this characteristic of science to push information that simply isn’t supported by the scientific method.

Mathematical Sophistication

When I reflect on my education in science (and in physics in particular), the common theme I see is just how the amount of sophistication present in the computations and concepts I learned each year kept increasing. If there was one thing I could count on, it wasn’t learning something “new”. Instead, it was about viewing things I might have once taken for granted as a process that was much deeper than I realized.

For example, take Snell’s law. In secondary school, I learned how this phenomenon worked in the sense that I could calculate the effects. I learned that Snell’s law could be written like this:

$$n_1 \sin \theta_1 = n_2 \sin \theta_2$$

Here, $n_1$ and $n_2$ are the indices of refraction of the two media, and $\theta_1$ and $\theta_2$ are the angles the light ray makes with the normal on each side. This allows one to calculate the angle of refraction for various simple systems, and this is exactly what I remember doing. Additionally, the “reason” for why this was true seemed to be something about the light “slowing down” in a different medium, but the reasoning wasn’t all that clear. In the end, it was more of a “here’s the law, now calculate it” sort of concept.

At the time, I don’t remember being bothered by this. Now though, it makes me frustrated, since what is the point of learning these ideas if one doesn’t learn why this specific result occurs? It’s something I’ve been thinking about a fair amount lately.

Fast-forward a few years, and now Snell’s law gets derived using Fermat’s principle of least time, which uses the calculus of variations, and gives one a more satisfying explanation concerning what is going on when the light rays “bend”. In this sense, the mathematics produce the result, which is better than being told the result.
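To see the mathematics produce the result, here is a small numerical sketch in Python. It doesn’t use the calculus of variations. Instead, it does a plain one-dimensional minimization of the travel time over the point where the ray crosses the interface, and then checks Snell’s law at the minimum. The geometry, the refractive indices, and the function names are all assumptions I’ve made up for illustration.

```python
import math

A, B, D = 1.0, 1.0, 2.0   # source height, target depth, horizontal separation
N1, N2 = 1.0, 1.5         # illustrative refractive indices above and below y = 0

def travel_time(x):
    """Optical path length (travel time times c) from (0, A) to (D, -B),
    for a ray crossing the interface y = 0 at the point (x, 0)."""
    return N1 * math.hypot(x, A) + N2 * math.hypot(D - x, B)

def minimise(f, lo, hi, tol=1e-10):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

x = minimise(travel_time, 0.0, D)
sin1 = x / math.hypot(x, A)            # sine of the angle of incidence
sin2 = (D - x) / math.hypot(D - x, B)  # sine of the angle of refraction
print(N1 * sin1, N2 * sin2)            # the two sides of Snell's law
```

The two printed numbers agree to within the tolerance of the search: the path of least time is the one that satisfies Snell’s law.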

Another example is one that I hadn’t thought about much until I came across it. Anyone who has gone through a class in statistics has seen how to fit a curve to a collection of data points. Usually, one is concerned only with fitting a linear curve, but sometimes we also plot quadratic curves as well (with software).

In the case of linear plots, in secondary school, the recipe went like this. First, one should plot the points on a graph. Then, one needs to carefully draw a rectangle around the data points, and then measure the dimensions of this rectangle. From there, the slope can be calculated, and then a representative point was chosen in order to find the initial value of the line. Basically, this was an exercise in graphing and drawing accuracy, not something you’d want from a mathematics class. As such, while the results were qualitatively correct, they could differ widely from student to student.

Fast-forward a few years later once again, and the story is much different. In my introductory statistics for science class, we were given the equation that would give us the slope of our linear equation, as well as the correct point to use for the initial value. This undoubtedly produced more accurate results, but once again it lacked the motivation behind it (due to a lack of time, in this case). Thankfully, this lack of explanation was addressed in my linear algebra class, where we learned the method of least-squares. Here was finally an explanation as to how these curves were computed. In the statistics class, it was a long and complicated formula that was given. However, in linear algebra, the reasoning behind how to compute such a curve was much simpler and more straightforward. In other words, it made sense as a process. Even better, this method generalizes well for other types of curve fitting, not just linear functions. As such, this explanation was much more useful than all of the other ones.
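For the curious, here is roughly what the linear algebra version looks like for a straight line. This is a minimal Python sketch, assuming the usual setup where each row of the matrix $A$ is $[x_i, 1]$ and we solve the normal equations $A^T A \, \mathbf{c} = A^T \mathbf{y}$ for the slope and intercept. The function name `fit_line` and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept from the 2x2 normal equations
    (A^T A) [slope, intercept]^T = A^T y, where each row of A is [x_i, 1]."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # determinant of A^T A
    slope = (n * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    return slope, intercept

# Made-up data scattered around y = 2x + 1, purely for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
print(fit_line(xs, ys))
```

Adding a column of $x_i^2$ to $A$ would give a quadratic fit by the same recipe, only with a 3x3 system to solve, which is the generalization mentioned above.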

The lesson that I personally get is that, no matter the topic you’re learning, there often is another layer of understanding that can complement it. This means that I shouldn’t stop looking at concepts that I’ve seen many times just because I think they are boring! There are often new perspectives to look at the situations, and they usually come tied with more mathematical sophistication. This is something that I love to see, because it brings new viewpoints to concepts I might have thought I had completely figured out. This shows me that I can always learn and understand a concept more thoroughly, and hopefully this can be good inspiration for you to seek out varied explanations of your favourite concepts.

Just because classical mechanics is, well, classical, doesn’t mean you can’t look at it in more sophisticated ways.