Minimization (The Rope Cutting Problem)

If you’re in secondary school, chances are you’ve had to discuss a bunch of properties of functions. These include finding the maxima and minima, the roots (zeros), the intervals where the function is positive or negative, the intervals where the function is increasing or decreasing, and the domain of the function. While having to slog through page after page of this, you might be tempted to ask, “Why do I have to learn about this? Not only is it boring, it’s completely useless!”

Trust me, I hear you. The problem is that, through doing countless exercises of the same flavour, you gain proficiency, but you don’t learn why you should have this piece of information. To remedy this, let me give you a problem where knowing the properties of a function is quite useful. I came across this problem while watching some videos by the great mathematics teacher Eddie Woo, and I think if you’ve been wanting something a bit more than what you’ve done in your classes, this will be interesting to you. I’ll state the problem, and then I’ll explain the solution. As such, if you want to try the problem beforehand (which I recommend!), stop reading after I state the problem.

Problem: Imagine you have a one-metre rope. Your task is to decide where to cut the rope so that one piece forms a square and the other forms a circle. Where should the cut be made in order to minimize the combined area of the two shapes?


Solution:

Begin by choosing a variable to represent the problem. For the square, we have the potential variable $s$, which is the side length of the square. For the circle, we have the natural variable $r$, which represents the radius of the circle. Either variable will work, but only one is needed, since we can express one variable in terms of the other. For this solution, $r$ will be used.

We want to minimize the total area of both shapes, so we need a function that gives us the area. Since we know the area of a circle in terms of $r$, and the area of a square in terms of $s$, we get the following area function:

$$A = \pi r^2 + s^2 \tag{1}$$

Obviously, this has two variables in it, $r$ and $s$. However, as we mentioned above, we can express one variable in terms of the other. This is because the variables $r$ and $s$ can’t be just anything. They are constrained by the fact that the rope given in the problem is only one metre long. In other words, the length of the rope has to be the sum of the perimeters of the two shapes, since that’s exactly what the rope will be used to create. Therefore, we get the desired constraint:

$$2\pi r + 4s = 1 \quad\Longrightarrow\quad s = \frac{1 - 2\pi r}{4} \tag{2}$$

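Now that we have the variable $s$ expressed in terms of $r$, we substitute this into Equation (1):

$$A(r) = \pi r^2 + \left(\frac{1 - 2\pi r}{4}\right)^2 = \frac{16\pi + 4\pi^2}{16}r^2 - \frac{4\pi}{16}r + \frac{1}{16} \tag{3}$$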

The question now becomes, “Where is this a minimum?” To answer this, you might wonder if there is a minimum, but after a moment’s reflection you should recognize that Equation (3) is the equation of a parabola with a positive coefficient for the $r^2$ term, meaning the parabola makes a “U” shape and so does have a minimum. To find the minimum, we will use two different strategies.

The Calculus Way

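If you know a bit of calculus, you might be tempted to take the derivative and set it equal to zero (since a parabola only has one point where the derivative is zero). Doing so gives the following:

$$\frac{dA}{dr} = \frac{16\pi + 4\pi^2}{8}r - \frac{4\pi}{16} = 0 \quad\Longrightarrow\quad r_{min} = \frac{1}{8 + 2\pi} \tag{4}$$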

This gives us the radius at which the total area is a minimum. To know where to cut the rope, we need the circumference of the circle (since that is the length of rope the circle uses). This means that we need to cut the rope at $2\pi r_{min} = \frac{2\pi}{8+2\pi} = \frac{\pi}{4+\pi}$, which is approximately 44% of the rope.
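If you’d like to double-check this result numerically, here is a minimal Python sketch. The function name `total_area` and the brute-force scan are my own illustration rather than part of the original solution; the scan simply tries many admissible radii and keeps the one with the smallest combined area.

```python
import math

def total_area(r):
    """Combined area of a circle of radius r and a square made from the rest of a 1 m rope."""
    s = (1 - 2 * math.pi * r) / 4  # side of the square built from the leftover rope
    return math.pi * r**2 + s**2

# Closed-form minimiser obtained by setting the derivative to zero.
r_min = 1 / (8 + 2 * math.pi)
cut = 2 * math.pi * r_min  # length of rope used for the circle

# Brute-force scan over admissible radii (0 <= r <= 1/(2*pi)) as a cross-check.
r_max = 1 / (2 * math.pi)
steps = 100_000
r_scan = min((i * r_max / steps for i in range(steps + 1)), key=total_area)

print(f"r_min = {r_min:.6f}, scan = {r_scan:.6f}, cut = {cut:.4f}")
```

The scanned radius should agree with the closed-form one to within the step size of the scan.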

Completing the Square

The second way to solve this problem once we have Equation (3) is to complete the square. Doing so will transform Equation (3) into the “standard form” of a quadratic equation, which looks like $f(x) = a(x-h)^2 + k$. From there, we know that a quadratic function has a minimum or maximum at its vertex, which is given by $(h,k)$. Therefore, all we need to do is find the coordinate $h$, and we’ve got our minimum. To do this, we will just label the coefficients of Equation (3) as $a$, $b$, and $c$, so we get the following:

$$A(r) = ar^2 + br + c \tag{5}$$

To complete the square, we first have to factor the $a$ from both the $r^2$ and $r$ terms, giving:

$$A(r) = a\left(r^2 + \frac{b}{a}r\right) + c \tag{6}$$

Now, we simply write what is in parentheses as a perfect square, and subtract off the extra part. This gives:

$$A(r) = a\left[\left(r + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2}\right] + c \tag{7}$$

Cleaning up these terms brings us to the standard form:

$$A(r) = a\left(r + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a} \tag{8}$$

If we recall what our coefficients were, we see that $a = \frac{16\pi + 4\pi^2}{16}$ and $b = \frac{-4\pi}{16}$. From Equation (8) above, we know that our coordinate $h$ is $-b/(2a)$ (remember that the standard form of the equation has $-h$ in it), so we get:

$$h = -\frac{b}{2a} = \frac{4\pi/16}{2(16\pi + 4\pi^2)/16} = \frac{4\pi}{32\pi + 8\pi^2} = \frac{1}{8 + 2\pi} \tag{9}$$

Once again, if we want to find the spot to cut the rope, we need to multiply this result by $2\pi$ (to get the circumference of the circle). Doing so gives us once again $\frac{\pi}{4+\pi}$, which is about 44% of the rope.
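As a sanity check on the algebra, here is a small Python sketch that compares the expanded quadratic with its completed-square form. The values of $a$ and $b$ are the ones quoted above; the value $c = 1/16$ is my own assumption, obtained by expanding the square’s area $\left(\frac{1-2\pi r}{4}\right)^2$, and the function names are purely illustrative.

```python
import math

# Coefficients of A(r) = a*r**2 + b*r + c.
a = (16 * math.pi + 4 * math.pi**2) / 16
b = -4 * math.pi / 16
c = 1 / 16  # assumed: constant term from expanding ((1 - 2*pi*r) / 4)**2

h = -b / (2 * a)        # r-coordinate of the vertex
k = c - b**2 / (4 * a)  # minimum combined area (value of A at the vertex)

def expanded_form(r):
    return a * r**2 + b * r + c

def standard_form(r):
    return a * (r - h) ** 2 + k

print(f"h = {h:.6f}, k = {k:.6f}, cut = {2 * math.pi * h:.4f}")
```

Both forms should return the same area for any radius, and `h` should match the radius found with calculus.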


Hopefully, you were able to solve the problem. I really like this problem because you need to come up with your own variables and you need to know how to find the minimum of a function, which are things you might practice a lot, but never really use in a problem. As such, I wanted to give you a taste of what these function properties can actually help us with.

Happy problem solving.

Machines and Processes

When you first learned about algebra, chances are you learned about something called a function, typically one that looks like this:

$$f(x) = mx + b$$

This is nothing more than the equation of a straight line. You probably also learned how this could be represented as a graph (which is why you know it’s a straight line). This was simple enough, and you soon learned how to deal with different kinds of functions. These include quadratics (parabolas), exponentials, rationals, and a host of other functions. You learned what these looked like when graphed, and how to find various properties of these functions. This includes finding the roots of the equation (when $f(x)=0$), finding the domain and range, and identifying where the graph is increasing or decreasing.

To find some of these properties, you actually had to interact with the function. That meant working through the algebra and manipulating some equations. Depending on how comfortable you were in mathematics, this could be easy or difficult. However, assuming you got past this stage and still understood what was happening, you then got to more exotic functions, such as logarithmic or trigonometric functions. These have the form $f(x) = \log(x)$ and $f(x) = \sin(x)$, and aren’t your usual functions. This is where you start seeing students making the mistake of dividing by $\sin$ or $\log$.

Why does this happen? From my experience, it’s due to the sense that everything in mathematics is linear. To illustrate this, let’s look at an easy example. Suppose we have the equation $\log(x+1) = 2$. You might be tempted to say that this is the same as $\log(x) + \log(1) = 2$, but this would be incorrect! What we actually have to do is convert the equation into exponential form, giving us $10^2 = x+1$. The answer itself isn’t really important. What’s important is that the logarithm isn’t linear, and so you can’t simply distribute it onto the term $(x+1)$.
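To see the difference with actual numbers, here is a minimal Python sketch. It assumes the logarithm is base 10 (as the exponential form $10^2$ suggests) and uses $x = 99$, the value that solves the equation; the variable names are just for illustration.

```python
import math

x = 99  # solves log(x + 1) = 2 for a base-10 logarithm

correct = math.log10(x + 1)                    # the machine applied to the whole input
distributed = math.log10(x) + math.log10(1)    # the tempting (incorrect) "distributed" version

print(correct, distributed)
```

The two values don’t match, which is exactly the point: the logarithm acts on its whole input at once.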

From what I can gather, the reason this happens is that $\log(x+1)$ is very similar in notation to $4(x+1)$. We know that the latter is equivalent to $4x+4$, which we get by multiplying each term inside the parentheses by $4$. As such, it isn’t so surprising when students think that the same should apply to these new things called logarithms. The same is true for the expression $\sin(x+1)$ and $(x+1)^2$. We feel like it’s completely natural to write $\sin(x)+\sin(1)$ and $(x^2+1^2)$, but this is incorrect.

Instead, we should think of expressions like $\sin(x)$ and $\log(x)$ as functions or machines. When you insert a number $x$ into them, the machine runs, and spits a number back out. The crucial part is that the output depends heavily on the number you put in. Said differently: if I put $(2+5)$ into $\sin(x)$, or if I do $\sin(2)+\sin(5)$, I get two very different answers. For comparison, with the angles in radians, the former gives approximately $0.656987$, and the latter gives $0.909297-0.958924=-0.049627$. Obviously, these two numbers aren’t the same. Also, for those who have studied trigonometric functions, you know that $\sin(x)$ varies from $-1$ to $1$, which means that $\sin(x+y)$ must also vary from $-1$ to $1$, while $\sin(x)+\sin(y)$ could vary from $-2$ to $2$, which means these two expressions can’t be the same in general.
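You can reproduce this comparison yourself with a few lines of Python. This is only a quick sketch (the variable names are mine), and note that `math.sin` works in radians.

```python
import math

whole_input = math.sin(2 + 5)            # put 2 + 5 = 7 into the machine
split_input = math.sin(2) + math.sin(5)  # run the machine twice and add the outputs

print(f"sin(2 + 5)      = {whole_input:.6f}")
print(f"sin(2) + sin(5) = {split_input:.6f}")
```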

This is also true for logarithms, and many other mathematical functions you may encounter. The difficulty at this point is to deal with this while working with algebraic equations. You can’t simply add, subtract, multiply, and divide your way to a solution. You have to know what functions you’re dealing with, and how to work with them.

Last example: consider the function $f(x) = 4\sin(x^2-2) +1$. This function has a lot going on, but the way I think about it is that you have a function within a function within a function. To make this explicit, label $g(x) = \sin(x)$, and $h(x) = x^2-2$. Remember that the variable $x$ in both of these equations is just a placeholder. You can stick anything there. If we think back to our analogy of a machine, think of $x$ as where you input your value into the machine. Now, if we write $f(x)$ in terms of $g(x)$ and $h(x)$, we get the following:

$$f(x) = 4g(h(x)) + 1$$

This equation may look a bit busy, but that’s the point! I want you to really see where $x$ is located. It’s nestled into the “deepest” function, $h(x)$. What I also want you to see is that instead of writing $g(x)$, I wrote $g(h(x))$. This means that instead of sticking $x$ into the input of $g(x)$, I stuck $h(x)$ instead! There is nothing wrong with this, and it actually has a fancy name. It’s called a composition of functions. Think of it like a machine within a machine. The output of the first machine is connected to the input of the second machine, so that when you give the first machine an initial value, it passes through both machines. As such, you can’t exactly “split up” the machine into two different parts and expect to get the same answer as doing both together. It depends on the nature of the machine.
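The machine analogy maps nicely onto code, where each function really is a little machine. Here is a minimal Python sketch using the same labels $g$ and $h$ as above; `f_direct` is a name I’ve made up to hold the original one-line formula for comparison.

```python
import math

def h(x):
    # Innermost machine: squares the input and subtracts 2.
    return x**2 - 2

def g(x):
    # Middle machine: takes the sine of whatever it is given.
    return math.sin(x)

def f(x):
    # Outer machine: feeds x into h, passes h's output to g, then scales and shifts.
    return 4 * g(h(x)) + 1

def f_direct(x):
    # The original formula written in one go, for comparison.
    return 4 * math.sin(x**2 - 2) + 1

for value in (0.0, 1.0, 2.5):
    print(value, f(value), f_direct(value))
```

Both versions should print the same outputs, since chaining the machines is the same thing as the original formula.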


Hopefully, this gives a bit of intuition into the idea of more complex functions such as the logarithmic or trigonometric functions. They don’t distribute linearly, and so you can’t apply your usual rules of algebra to them. Additionally, it’s important to remember that expressions like $\log$ and $\sin$ by themselves have no meaning. They aren’t numbers. Typing these into your calculator and pressing the “=” button gives an error, and rightly so! Remember, they are like machines or processes, so asking what the output of a machine is without any input doesn’t make sense.

By keeping in mind the analogy of functions as machines, you should have a better conceptual understanding of why logarithms and trigonometric functions don’t simply distribute, and this will translate to understanding how to manipulate them without mistakes.

Notation

In mathematics, notation is simultaneously everything and nothing. It isn’t difficult to imagine an alien species having the same notions of calculus as we do, but without the symbols of integration or differentiation. It might seem so natural now to see the expression $\partial x$, but that’s only because we’ve spent years working with these symbols, forging a connection between concepts and notation. Due to this, it can seem entirely natural to look at notation and instantly understand what it’s about as a concept, rather than just symbols. This is quite similar to our experience with foreign languages, where the words and characters look alien to us, yet our own languages seem so obvious.

I’ve been thinking about notation after working with some students who seemed to be struggling with certain concepts. I was wondering why they couldn’t just see the same things as I could on the paper. I know that some struggle with seeing the connection between $f(x)$ and $y$ in an equation, even though of course they mean the same thing in this setting. Another variation of this issue is when a text or a problem refers to multiple functions, and names them $f(x)$, $g(x)$, and $h(x)$. It might seem natural to us to have these be the names for arbitrary functions, but suddenly springing this notation on students can be deeply confusing when they aren’t used to it. The consequence is that it seems random and without explanation, so then students start believing that part of mathematics is just like that: innately confusing.

In my mind, unless we’re talking about probability, mathematics should not seem random.

In fact, mathematics is all about investigating structure. To do this, however, we absolutely require definitions and notation. But the thing that is often lost on students is that we create these definitions and notation! They’re there because they’ve (mostly) stood the test of time of being good for problems. As such, I think it’s critical that we get students on board with the notation, and capable of moving fluidly between notations and concepts.

For students in their secondary education, this means being familiar with the idea of an equation, and not being tied to the notation itself. As an explicit example, this means having students being aware of parabolic functions apart from the “It’s the one with $x^2$ in it!” I want students to be comfortable with saying that $d(t) = t^2$ is just as much of a parabola as $y = x^2$ is. The notation is important, but the specific symbols aren’t.

Here’s another example that I find really tests whether a student understands the concept of what a function is. If they’re given the function $f(x) = x + 2$ and are asked for the value of the function when $x=2$, many will write $f(x) = 2 + 2=4$. Of course, the function value is correct, but the problem is that this isn’t $f(x)$, it’s $f(2)$. This isn’t a huge deal, but it teases out the apparent weakness with the notation $f(x)$. From what I’ve seen with many secondary students, they don’t translate finding the value of a function to putting that value into the notation of $f(x)$. I’d argue that this implies they aren’t fully grasping what $f(x)$ means, but I also think it could simply be a lack of explanation. To remedy this, we need to put more emphasis on explaining the concepts behind the notation, so that the students will be on board with using it. If we don’t do this, we create more problems for ourselves down the road when more complex notation comes along and students aren’t ready to fluidly jump from one set of notation to the next.
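Programming makes this distinction between the rule and its value quite concrete, so here is a small Python sketch. The function `f` is the one from the example above, and `d` and `y` mirror the two parabolas $d(t) = t^2$ and $y = x^2$ mentioned earlier; treating $y$ as a function name is my own simplification for the illustration.

```python
def f(x):
    # The rule f(x) = x + 2; x is only a placeholder for the input.
    return x + 2

def d(t):
    # A parabola written with the names d and t.
    return t**2

def y(x):
    # The same parabola written with the names y and x.
    return x**2

value_at_two = f(2)  # this is f(2), the output for the input 2, not f(x) itself
same_rule = all(d(n) == y(n) for n in range(-5, 6))

print(value_at_two, same_rule)
```

Writing `f(2)` is what actually runs the machine on the input 2, and renaming the letters in `d` and `y` changes nothing about the outputs.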

Overall, notation isn’t important in and of itself. But to do mathematics and learn new topics, it’s crucial to be able to understand what a certain notation means and how to use it.

The Importance of Factoring

When you’re trying to solve a simple algebraic equation like $ab = 5b$ for the variable $a$, it quickly becomes second nature to divide both sides of the equation by $b$, yielding $a = 5$. This makes complete sense, and it’s what most people would do right off, without even thinking. I mean, look at both sides of that equation! If there’s a $b$ on both sides, then the other values on each side of the equation should be equal to each other, giving us $a = 5$.

But not so fast.

What if I were to tell you that this wasn’t the only solution? What if there was another solution to your equation that you missed?

To prove this, let’s look at the original equation again. We have $ab = 5b$, so let’s subtract $5b$ from both sides of the equation. Doing so gives us:

$$ab - 5b = 0 \quad\Longrightarrow\quad b(a - 5) = 0$$

As you can see, what we did in the second step was factor out the $b$ from both terms, giving us a product of two factors that equals zero. Once we have that, we know how to proceed: at least one of the factors in the product must be equal to zero. Looking at this, we see that $a = 5$ is a solution, like we said before. However, there’s a second solution to this equation, which is $b=0$.

So what happened? How did we miss a solution when we first solved the problem?

The issue was that we divided the equation by $b$, but as we just saw, a solution to the equation is $b = 0$. This means that we were potentially dividing by zero! As most readers know, this is a big problem. We can’t divide an equation by zero, and so by doing this, we were in effect saying that $b \neq 0$. This meant that the solution we found was only valid when $b$ was not zero. As a result, we neglected to think about what happened to the equation when $b$ was zero, and so we lost a solution. By factoring the equation instead of dividing by $b$, we can avoid losing the $b=0$ solution and get both in one go.
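If you want to see the lost solutions explicitly, here is a minimal Python sketch that brute-forces the equation over a small grid of integers. The grid ranges and variable names are arbitrary choices of mine for the illustration.

```python
# Every integer pair (a, b) on a small grid that satisfies a*b == 5*b.
solutions = [
    (a, b)
    for a in range(-3, 8)
    for b in range(-3, 4)
    if a * b == 5 * b
]

# Dividing by b silently assumes b != 0, so these are the only pairs that survive.
after_dividing = [(a, b) for (a, b) in solutions if b != 0]

# The pairs that the division threw away.
lost = [pair for pair in solutions if pair not in after_dividing]

print("lost by dividing:", lost)
```

Every pair in `lost` has $b = 0$, while every pair that survives the division has $a = 5$.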


When we are working through a problem that involves algebra, we tend to push forward without necessarily thinking about the technicalities of what we are doing. Is there the same variable on both sides of the equation? Great, I can cancel them! It’s almost a reflexive habit, but it’s one we need to try and actively resist. By factoring instead of dividing, you create a product that equals zero, allowing you to be sure that you capture all of the solutions to a problem.

Of course, sometimes it’s fine to divide terms out of an equation. However, you need to make sure that you aren’t potentially dividing by zero.