When you learn a new concept, chances are that there’s some sort of procedure to follow in order to come up with an answer to a problem. This helps students when they are first learning, because it lets them follow clearly laid out steps that will culminate in the correct answer. For example, if we were trying to add two fractions together, we know that a common denominator is needed. As a result, students might be told that they should multiply each fraction by the other’s denominator, which will guarantee that the denominators are the same. It might even be written in a nice three-step method like this:

  • Identify your fractions as a/b and c/d.
  • Multiply a by d and c by b.
  • Write the denominator as bd. Your new fraction should be of the form (ad+bc)/(bd).

This is an easy-to-follow recipe that will produce the correct answer all of the time. It’s made so that you can go through the steps one line at a time and arrive at the answer you want. Students might even be encouraged to memorize this procedure.

I want to argue that this is a fundamentally flawed idea of how we should approach teaching students to solve problems.

First, let’s take a look at the example above. Imagine we wanted to add 1/2 and 3/4. The algorithm tells us that we should transform the two fractions in order to get 4/8 and 6/8, so that in total we get 10/8. This is the correct answer, but anyone who has taught this concept knows that this isn’t the most efficient way to go about it. Instead, we notice that 4 is a multiple of 2, so we only have to change the first fraction in order to create common denominators. Doing so gives us 2/4+3/4=5/4, which is the same answer as above, but simplified. In fact, when doing these problems, students are usually required to simplify anyway, so the student would have to do more work after following the recipe.
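
To make the comparison concrete, here is a minimal Python sketch of both approaches. The function names `add_by_recipe` and `add_by_lcd` are hypothetical labels chosen for illustration, and fractions are represented as plain (numerator, denominator) pairs.

```python
from math import gcd


def add_by_recipe(a, b, c, d):
    """Three-step recipe: a/b + c/d = (ad + cb) / (bd), left unsimplified."""
    return (a * d + c * b, b * d)


def add_by_lcd(a, b, c, d):
    """Use the least common denominator instead of the product b*d."""
    lcd = b * d // gcd(b, d)
    return (a * (lcd // b) + c * (lcd // d), lcd)


recipe_result = add_by_recipe(1, 2, 3, 4)  # 1/2 + 3/4 via the recipe
lcd_result = add_by_lcd(1, 2, 3, 4)        # 1/2 + 3/4 via the shortcut
print(recipe_result, lcd_result)
```

Neither function simplifies its result. The second one simply starts from a smaller denominator, which is the kind of shortcut a student might spot on their own.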

This is a very simple example, but it illustrates an important point. Recipes blind students to shortcuts or other ways to solve a problem. Once a student is given a recipe, why should they look for a quicker way to do the problem? They know that the way they were given will work, so it usually isn’t worth the extra effort to look for a shortcut or another method. This is true even if the recipe takes longer to do! In this sense, I think we do students a great disservice if we only emphasize recipes.

It’s not that a recipe is bad. In fact, there are many wonderful recipes that solve many problems (more commonly known as algorithms in computer science and mathematics). However, the idea for those recipes is to feed them into a computer so that the computer can carry out the steps. They aren’t necessarily for students themselves to do. Plus, we don’t care if a computer takes a little longer to solve a problem (for the most part), because it can still do it fairly fast.

On the other hand, we want students to be able to solve all sorts of problems, and to use the techniques they’ve learned to tackle these problems. But recipes make students home in on one way to solve a problem, without thinking about anything else. They offload the thinking about a problem and reduce it to following predetermined steps. This could lead a student to completely miss the point of a problem, or to miss an interesting connection they would have seen if they weren’t strictly following an algorithm.

Some may scoff and say that seeing a better path to a problem rarely actually happens, but that’s because they aren’t trying hard enough. There are all sorts of tricks and techniques that one can use to solve problems without needing to go through recipes.

Another ripe example is the factoring of expressions, something that is the bane of many students. In secondary school, students focus on factoring quadratic polynomials, and there are several cases that one has to consider in order to get the factorization just right. Since memory aids are allowed for students, these procedures are typically written down, and some students may even analyze each expression to see which case it falls into.

This is a horrible way to go about learning factorization. It reduces the process to a classification problem, where students simply match their expression with the corresponding case, and follow the steps. This means that students will often miss shortcuts and other techniques that could be just as helpful to them as following the procedure!

The other problem with recipes is that they substitute aptitude at following a procedure for knowledge. Instead of knowing why the recipe works, the student becomes responsible only for using the recipe correctly. This encourages students to not even think about what they are doing, since they know the output is what they want. As a result, students aren’t thinking about the problem as much as going through the motions. This can lead teachers to think that a student knows why a concept works, when really the student can only tell you how it works.

My own vivid example comes from learning how to complete the square in secondary school. The idea is that you want to rewrite a quadratic expression as something of the form a(x-b)²+c. When I first learned about it, the recipe seemed like magic (and not in a good way!). I had little idea about what was happening, but I did know that if I followed the steps on my memory aid, I could solve the problem.

Did I look deeper in order to understand what was actually going on? Of course not. It was only years later that I looked at the technique again and saw that it did make sense, and that I should have understood it better when I was first introduced to it. However, since the recipe was there, I traded the responsibility of understanding the technique for simply being able to use it.
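
For what it’s worth, the technique takes only a few lines when worked from first principles. As a sketch, write the starting quadratic as $px^2 + qx + r$ with $p \neq 0$ (the letters $p$, $q$, $r$ are chosen here only to avoid clashing with the letters in the target form):

$$
\begin{aligned}
px^2 + qx + r &= p\left(x^2 + \frac{q}{p}x\right) + r \\
&= p\left(x^2 + \frac{q}{p}x + \frac{q^2}{4p^2}\right) - \frac{q^2}{4p} + r \\
&= p\left(x + \frac{q}{2p}\right)^2 + \left(r - \frac{q^2}{4p}\right).
\end{aligned}
$$

The only “trick” is adding and subtracting $q^2/(4p)$ so that the bracket becomes a perfect square. Matching this against the target form gives a = p, b = -q/(2p), and c = r - q²/(4p).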

I want to finish with the acknowledgment that recipes can be useful. They can speed problems up, particularly the ones that are repeated over and over again (like finding the roots of a quadratic function). However, you should always ask yourself, “Could I work from first principles to get back to this point?” If the answer is “yes”, then you have understood the concept. If it’s “no”, then you should investigate that uncertainty! It’s an opportunity to learn. This is why I always try to ask the students I work with conceptual questions, because if the point of education is just to follow recipes, A.I. will replace us much sooner than we might want.

Intuition about Ideals

When studying rings in abstract algebra, one also learns about subrings. They are pretty much exactly what you would expect: subsets of a ring that are themselves rings under the same operations. However, a more interesting type of subring is an ideal.

Definition: An ideal $I$ of a ring $R$ is a subring with the special property that, for any element $a \in I$ and any element $r \in R$, $ar \in I$ and $ra \in I$.

Also, note that if we have a commutative ring, then you only need to check one of the cases above. So that’s the formal definition. It’s a little clunky, but there’s a nice intuition behind it.

An ideal “absorbs” the elements that it comes into contact with. In other words, any time an element in your ring $R$ is multiplied by an element of the ideal, the product becomes part of the ideal. (I kind of want to give a zombie analogy, but I’ll let you fill in the details for now!)

On its own, this isn’t particularly interesting. So what if ideals absorb their elements?

The real magic requires a bit more theory. First, we can have a particular kind of ideal, called a principal ideal, which (in a commutative ring) is an ideal of the form $\left< a \right> = \{ ar : r \in R \}$ for some fixed element $a \in R$. This simply means that the element $a$ “generates” (or is a factor of) every single element in the ideal. For a quick example, if we consider the ring of integers (which is just the integers with addition and multiplication defined on them), $\left< 2 \right>$ is a principal ideal of the integers, which consists of all the even integers. No matter what integer you multiply $2$ by, even one that isn’t in $\left< 2 \right>$ itself, the result will be in the ideal.
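
As a quick sanity check of the absorbing property for $\left< 2 \right>$, here is a small Python sketch. It only tests a finite window of integers, so it illustrates the property rather than proving it, and the helper name `in_ideal` is a hypothetical one introduced for this example.

```python
def in_ideal(n, a=2):
    """True if n lies in the principal ideal <a> of the integers (a divides n)."""
    return n % a == 0


ring_sample = range(-20, 21)                            # a finite window into Z
ideal_sample = [n for n in ring_sample if in_ideal(n)]  # the even integers in it

# Absorption: (element of the ideal) * (any integer) lands back in the ideal.
absorbs = all(in_ideal(a * r) for a in ideal_sample for r in ring_sample)

# Odd integers are not in <2>, yet 2 times any of them is in <2>.
odd_multiples = [2 * r for r in ring_sample if not in_ideal(r)]
odd_multiples_absorbed = all(in_ideal(m) for m in odd_multiples)

print(absorbs, odd_multiples_absorbed)
```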

We can now consider the factor ring $R/ \left< a \right> = \{ x + \left< a \right> : x \in R \}$. I’m going to avoid talking about classes and equivalence relations here, so instead, I’ll describe the idea behind these factor rings. Essentially, the factor ring above is the set of all elements in $R$ such that any “factors” of $\left< a \right>$ are taken out. In other words, we are taking the elements $x \in R$ modulo the elements that are of the form $ay, y \in R$.

If we go back to our example with the ring of integers and $\left< 2 \right>$, we can consider the factor ring $\mathbb{Z} / \left< 2 \right>$. What are the elements which are part of this set? Well, the elements of $\left< 2 \right>$ are all of the even integers. Therefore, the elements which are left in our factor ring can’t have any factors of two in them. Furthermore, they can’t have factors of two in them as a result of division with remainder. To see this explicitly, consider $65$. This number is odd, so it isn’t divisible by two. However, $65 = 32 \cdot 2 + 1$. As such, the $32 \cdot 2 \in \left< 2 \right>$, so the ideal “absorbs” it. Operationally, that means it disappears, so we are simply left with $1$.

If you continue with this factor ring, you will see that the only numbers we can be left with are $-1$, $0$, and $1$. We can then remove $-1$ by simply adding $2$ (which is equivalent to zero in our factor ring). What we end up with is precisely the integers modulo two.
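
The “absorbing” step can be sketched in Python by reducing each integer to a canonical representative. The function names below are hypothetical, and the sketch assumes a positive generator $a$, relying on the fact that Python’s `%` operator returns a value between 0 and a-1 in that case.

```python
def representative(x, a=2):
    """Canonical representative of the coset x + <a> in Z/<a>, assuming a > 0."""
    return x % a


def add(x, y, a=2):
    """Addition in the factor ring, computed on representatives."""
    return representative(x + y, a)


def mul(x, y, a=2):
    """Multiplication in the factor ring, computed on representatives."""
    return representative(x * y, a)


# 65 = 32*2 + 1: the 32*2 part is absorbed by the ideal.
print(representative(65))

# -1 and 1 differ by 2, so they land on the same representative.
print(representative(-1), representative(1))

# Every integer in this window reduces to one of two elements.
elements = sorted({representative(x) for x in range(-50, 51)})
print(elements)
```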

As you can see, these special ideals act as absorbers that take in certain elements and remove them from the ring. Why do we want to remove elements? Well, a particular type of ring that is useful to work in is a field, where all the nonzero elements are units and there are no zero-divisors. Creating a factor ring by “dividing” out the elements of an ideal can give us a field. In fact, there’s a theorem that tells us precisely when this happens (for a commutative ring with identity, the factor ring is a field exactly when the ideal is maximal), but that’s for another time.
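
For the integers, whether the factor ring is a field can be checked by brute force. The sketch below is only an illustration for small moduli, with `is_field` as a hypothetical helper name: it tests whether every nonzero element of $\mathbb{Z} / \left< n \right>$ has a multiplicative inverse.

```python
def is_field(n):
    """Brute-force check that Z/<n> is a field (every nonzero class is a unit)."""
    if n < 2:
        return False
    return all(
        any((x * y) % n == 1 for y in range(1, n))  # does x have an inverse?
        for x in range(1, n)
    )


for n in (2, 4, 5, 6):
    print(n, is_field(n))
```

In $\mathbb{Z} / \left< 4 \right>$, for instance, the class of $2$ is a zero-divisor, since $2 \cdot 2 = 4$ is absorbed to zero, so it has no inverse.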

Why Can't We Reach the Speed of Light By Boosting?

If you have ever come across someone talking about special relativity, there’s a good chance you will be able to tell me one of the two fundamental axioms in the subject: the fact that the speed of light is constant in all frames of reference.

After thinking about this for a few moments, one might come up with a thought experiment that looks like a counterexample. Imagine (for the sake of the experiment) that there’s a train moving at a constant speed (we’ll call it v) along the ground with respect to you, the unmoving person. Then, imagine that there’s another train on top of the first train, and it’s moving at a speed v, but with respect to the first train. As such, you would undoubtedly agree that the second train seems to be moving faster than the first train, since it has the speed of the first train and its own speed.

Following this argument, it seems reasonably straightforward that if you keep on stacking trains on top of each other such that they all are moving relative to the last one with speed v, at some point, no matter how slow the speed v is, one of the trains should exceed the speed of light. Bingo, we’ve done it!

Unfortunately, this is not so. But what is the actual problem here? Why can’t we get past the speed of light?

To answer this, I’ll have to introduce a few things. But first, some notes. First, we are trying to see why the speed of light acts as a “cosmic speed limit”, so we aren’t going to simply say, “Let the speed of the first train be faster than the speed of light.” Instead, we want to see why we can’t build past it. Second, I want to note that this isn’t just a strange situation that has no applications. If you don’t like the scenario with the trains, imagine continually boosting oneself to a faster and faster speed. Of course, this still has to be done with the right Lorentz transformation, but you will hit a barrier.

With that out of the way, let’s dig into the problem. We could try to do successive Lorentz transformations in order to get the relative velocity between the observer on the ground and the nth train. However, the fact is that this isn’t the “natural” or easy way to do the problem. In special relativity, velocities don’t add like we are used to. Instead, there’s a fancy factor γ and several other differences when transforming velocities. However, there is a quantity that does simply add when two velocities along the same direction are composed. It is called the rapidity, and is denoted with the letter φ. It is defined as:

$$\varphi = \tanh^{-1}\beta$$

Here, β is just the ratio of the speed (that any train is moving at with respect to the one underneath it) to the speed of light. For our problem, this value is constant. We want to know how fast the nth train is moving, so we just have to keep on adding the rapidities. But they’re all the same! This means we get the rather simple result that the rapidity for the nth train is n times the first train’s rapidity. We then use the definition for φ from above in order to get the speed ratio β_n of the nth train in terms of the single-train ratio β₁:

$$\beta_n = \tanh\left(n\varphi_1\right) = \tanh\left(n\tanh^{-1}\beta_1\right)$$

We could stop here since everything in the above expression is a constant that we would input, but we don’t just want to know what happens for the nth train. We want to keep boosting forever, until we get past the speed of light! This means we need to take a limit as n tends to infinity. Therefore, it would be nice to have our above expression in a form whose limit is easier to take. I’ll spare you the gory details, but basically we can use the identities that express hyperbolic functions in terms of exponential and logarithmic functions. After doing this, you should be able to get something like the following:
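
As a sketch of those details (using the standard rapidity definition φ = tanh⁻¹ β, and writing β₁ for the speed ratio of a single train and β_n for that of the nth train relative to the ground), the two identities needed are

$$
\tanh^{-1}\beta = \frac{1}{2}\ln\left(\frac{1+\beta}{1-\beta}\right), \qquad \tanh x = \frac{e^{2x}-1}{e^{2x}+1}.
$$

Setting $x = n\tanh^{-1}\beta_1$ gives $e^{2x} = \left(\frac{1+\beta_1}{1-\beta_1}\right)^n$, and therefore

$$
\beta_n = \tanh\left(n\tanh^{-1}\beta_1\right) = \frac{(1+\beta_1)^n - (1-\beta_1)^n}{(1+\beta_1)^n + (1-\beta_1)^n}.
$$

This is one of several equivalent ways to write the result. Since $0 < \beta_1 < 1$, the term $(1-\beta_1)^n$ becomes negligible next to $(1+\beta_1)^n$ as n grows, so the ratio tends to one without ever reaching it.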

This is actually a nice expression, because its limit is very easy to evaluate. Before this though, let’s look at a plot of this function as a function of n. In other words, we want to see the behaviour of this function as n gets larger.

Plot of the function defined above in terms of n.

As you can see, the function gets closer and closer to one as n increases. Why isn’t it the speed of light c? Remember what β is. It’s the speed of the train divided by the speed of light, so having the plot asymptotically go towards one is what we should expect. Also, if you’re curious, the value of β₁ used in the plot (the speed ratio of each train relative to the one beneath it) is 0.5, which is quite a large value. This is why the function approaches one very quickly.
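
The same behaviour can be checked numerically. The following Python sketch assumes the standard rapidity relation φ = tanh⁻¹ β and uses hypothetical function names. It compares everyday velocity addition, which is what the stacking argument quietly assumes, with the relativistic result for a single-train ratio of 0.5.

```python
import math


def beta_naive(beta_1, n):
    """Everyday (Galilean) addition: each train simply adds beta_1."""
    return n * beta_1


def beta_by_rapidity(beta_1, n):
    """Relativistic result: rapidities add, so phi_n = n * phi_1."""
    return math.tanh(n * math.atanh(beta_1))


def beta_closed_form(beta_1, n):
    """Same quantity written with powers instead of hyperbolic functions."""
    ratio = ((1 + beta_1) / (1 - beta_1)) ** n
    return (ratio - 1) / (ratio + 1)


for n in range(1, 11):
    print(
        n,
        beta_naive(0.5, n),
        round(beta_by_rapidity(0.5, n), 6),
        round(beta_closed_form(0.5, n), 6),
    )
```

The naive column passes one by the third train, while the other two columns agree with each other and stay below one for every n shown. For much larger n, floating-point rounding will eventually display exactly 1.0, which is a limitation of the arithmetic rather than of the physics.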

Intuition and Rigour

I’ve always been a bit wary about intuition. For a long time, I would avoid using the term, because I found it was creating a dangerous habit in terms of thinking clearly while solving problems. Your intuition isn’t always right. As such, instead of trying to guess when your intuition is correct, my advice was simply to throw it out, and work with what you have. This seemed like a much easier and more straightforward way to go about reasoning, particularly within mathematics.

However, as I’ve reflected on this matter further, I’ve come to realize that I actually do like using intuition, but not in the way that some people do. To see exactly what I mean, consider the classic example from calculus. We’ve all heard professors (particularly, if they aren’t teaching mathematics, but subjects in the sciences instead) say things like we should just “divide by $dt$ on both sides of the equation”. The problem with saying this isn’t that it works, but that it hides the reason why it works. There’s a reason our notation allows for this kind of manipulation, and it’s built right into the definition of a limit and how the derivative functions. It’s not a happy coincidence that we can then “divide” by $dt$ and get the correct answer. It really has to do with the way our mathematics is set up. I feel like the danger with this kind of “intuitive” manipulation is that students won’t remember the actual reason, and this causes problems down the road.
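
A small worked example shows where the manipulation comes from (the choice of $x = t^2$ is purely illustrative). With finite differences, everything is ordinary algebra:

$$
\Delta x = (t + \Delta t)^2 - t^2 = 2t\,\Delta t + (\Delta t)^2
\quad\Longrightarrow\quad
\frac{\Delta x}{\Delta t} = 2t + \Delta t \;\longrightarrow\; 2t \quad \text{as } \Delta t \to 0.
$$

Dividing by the finite quantity $\Delta t$ is perfectly legitimate, and the symbol $dx/dt$ is shorthand for the limit of that honest quotient. That is why treating $dt$ as something you can divide by tends to give the right answer.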

In my mind, the crux of the problem is captured by the following question. Is your argument for why something is true based upon your intuition? If so, then that’s a problem. It’s not the intuition itself that’s problematic. It’s the fact that you’re using that intuition as a pseudo-proof for why you think something is correct. If we want students to be more confident in their ability to prove things are true, then I think it’s worth it to highlight this distinction and make it explicit.

Furthermore, this change in perspective allows me to capture what I wanted to get out of intuition. In my mind, intuition is what lets you get a foothold into a new topic. By itself, you only get a rough idea of what is going on, and perhaps the direction an idea will take you. But the intuition is only there as a starting point. It’s not supposed to carry you through the whole concept. I like to think of it as the “one-sentence summary” of an idea, without the jargon.

This doesn’t mean that intuition is useless. In fact, it can be difficult to come up with these ideas, even if you know the topic well. That’s a challenge that I’ve been trying to work on, because I want to be able to explain an idea at its essence, without the extra baggage. That’s what I would argue intuition is for. Use it as a foothold into a new topic, but never forget that it’s only that. Don’t be lulled into thinking your intuition will carry you through, because mathematics has a lot of unintuitive surprises!