The Distance Formula (Equation Explainer)

The distance formula is one of the most frequently used relations in physics, allowing us to decompose a variety of vectors into different components. It’s something that every physics student uses, and so it becomes second nature for most of us. However, I’ve come across the sad fact that many secondary schools don’t seem to teach how the distance formula comes about and its connections with earlier work. As such, the link between algebra and geometry is lost, and the distance formula is reduced to calculating differences between $x$ and $y$ coordinates. Here, the goal is for us to look at the distance formula and see how it relates to other concepts that are of much use.


In secondary school, students are presented with a formula that is supposed to give the distance between two points on a graph:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

In my mind, this is a bit of an intimidating formula. There are several moving parts (variables), and it looks like you have to keep everything in check in order to get the distance right. As such, I find that almost no student actually remembers this formula without consulting their memory aid. It’s finicky, and it seems like one wrongly placed number will make everything go haywire[1].

However, what is important here is that the formula isn’t very illuminating to the student. I wouldn’t say it’s necessarily because the formula is “ugly” (which is subjective), but that it isn’t explained. To see the formula properly, it’s critical that we don’t only look at the algebra, but that we also draw some sketches.


Let’s set up our problem. Given two points $A$ and $B$, we want to find the distance between these points when we draw a straight line between them. We can represent this as such:

In our sketch, we simply have the two points that are placed in a regular Cartesian plane. Now, we know that the equation for the distance between points $A$ and $B$ is what we saw above, but what many students don’t see is that this equation can be seen from the graph. It’s not a magical formula that one gets independent of the graph. The formula is intimately tied to geometry. To make this suggestive, let me draw a few extra lines on our graph:

So how does this help us? Well, one thing you might notice is that if we connect the two points by a straight line, the length of that line is what we want. Furthermore, when we take this line with the other two I have drawn, we get a right triangle.

If there’s one thing I’m hoping that you remember, it’s that we have a very special relation between the three sides of a right triangle: the Pythagorean theorem. This is the quintessential secondary school formula, and I bet you can recite this one from memory. It’s given by $a^2 + b^2 = c^2$, where $a$ and $b$ are the legs of the triangle and $c$ is the hypotenuse.

But what I want you to see is that it’s exactly this relation that gives us our distance formula! If you look again at the lines I drew in our graph, we just need to find the two legs in order to compute the hypotenuse, which is our distance $d$. It might look like we don’t have the information to do this, but if we inspect our points more closely, you’ll find that we do.

Start with the green side. For simplicity, let’s suggestively call it $\Delta y$, since I think you would agree that we are looking at the change in the $y$ coordinate. The length for this leg of the triangle is simply how much the $y$ coordinate changes when we go from $A$ to $B$. Note here that when we talk about a length, our measurement is positive. This means that, even if point $B$ is lower than point $A$, we still say that the length is positive[2].

We can apply a similar argument for what I’ll call $\Delta x$, and this will give us a picture that looks like this:

Finding the distance between $A$ and $B$ is now a piece of cake. Since we know the two legs, we can apply the Pythagorean theorem to get:

$$d^2 = (\Delta x)^2 + (\Delta y)^2$$

Solving for $d$ gives us:

$$d = \sqrt{(\Delta x)^2 + (\Delta y)^2}$$

The final thing to do is find an expression for $\Delta x$ and $\Delta y$. This isn’t as difficult as it may look, because one thing we do know about this problem is where $A$ and $B$ are (which means we know their coordinates). Therefore, if we let $A = (x_1, y_1)$ and $B = (x_2, y_2)$, we know that $\Delta x = x_2 - x_1$ and $\Delta y = y_2 - y_1$. Putting this into our expression for $d$ gives us:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

This is exactly the expression you probably have somewhere on your memory aid, but now you know why it’s there. When we compute the distance between two points in the plane, what we are really doing is chopping that distance into a new path, one that gives us the legs of a right triangle. Then, using geometry we’re used to, we work backwards to find the original distance we are looking for. There’s no magic here, just geometry.
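To make the recipe concrete, here is a minimal Python sketch of the two-dimensional formula. The function name `distance_2d` and the sample points are my own choices for illustration; the steps are just the ones above: compute the two legs, then apply the Pythagorean theorem.

```python
import math

def distance_2d(a, b):
    """Distance between a = (x1, y1) and b = (x2, y2) in the plane."""
    x1, y1 = a
    x2, y2 = b
    dx = x2 - x1  # horizontal leg (may be negative)
    dy = y2 - y1  # vertical leg (may be negative)
    # Pythagorean theorem: d^2 = dx^2 + dy^2
    return math.sqrt(dx**2 + dy**2)

A = (1, 2)
B = (4, 6)
print(distance_2d(A, B))  # legs are 3 and 4, so this prints 5.0
print(distance_2d(B, A))  # also 5.0: swapping the points only flips signs
```

Swapping the two points changes the signs of `dx` and `dy`, but squaring removes the signs, so the distance is the same either way.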

Three Dimensions

You might now be wondering what the distance formula would be in three dimensions. Well, I’m not going to go through a whole rigorous proof, but here’s the idea. Imagine we have the following situation where we want to find the distance between the origin $O$ and another point $A$ (I’m using the origin just for a bit of added simplicity).

Hopefully you can see that the distance now occupies three dimensions, which I’ve denoted as $x$, $y$, and $z$. The algorithm is then simple. First, we compute the distance that lies on the $xy$-plane, as shown below (the solid orange line). Note that I’ve once again used $\Delta x$ and $\Delta y$ to denote the side lengths.

Now that we have the length along the plane, we can simply extend this to our third dimension by using the Pythagorean theorem a second time, with the in-plane length as one leg and the change in $z$ as the other.

Putting these steps together, we can see that the distance between a point given by $A = (x_0, y_0, z_0)$ and the origin $O$ is given by:

$$d = \sqrt{x_0^2 + y_0^2 + z_0^2}$$

This is then easily modified for any two points, by simply replacing each coordinate inside the squares with the corresponding difference of coordinates, such as $(x_1 - x_0)$.
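As a rough check of the two-step idea, the sketch below computes a three-dimensional distance in two ways: first by finding the length in the plane and then extending it to the third dimension, and second by putting all three squared differences under one square root. The function names and the sample point are hypothetical, chosen so that the numbers come out cleanly.

```python
import math

def distance_3d_two_steps(a, b):
    """Apply the Pythagorean theorem twice: in the plane, then upward."""
    dx = b[0] - a[0]
    dy = b[1] - a[1]
    dz = b[2] - a[2]
    in_plane = math.sqrt(dx**2 + dy**2)     # length along the xy-plane
    return math.sqrt(in_plane**2 + dz**2)   # extend to the third dimension

def distance_3d_direct(a, b):
    """Sum all three squared coordinate differences under one square root."""
    return math.sqrt(sum((q - p)**2 for p, q in zip(a, b)))

O = (0, 0, 0)
A = (3, 4, 12)
print(distance_3d_two_steps(O, A))  # in-plane length 5, then sqrt(25 + 144) = 13.0
print(distance_3d_direct(O, A))     # sqrt(9 + 16 + 144) = 13.0
```

Both routes agree because squaring the in-plane length simply gives back the sum of the first two squared differences.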

Going Up in Abstraction

Now that we’ve covered our physical dimensions, we might start asking what seem like more difficult questions, such as:

What’s the distance between two points in 4 dimensions? What about $n$ dimensions?

This isn’t a distance you can visualize, of course (unless you’re very good at visualization). We are limited to three dimensions, but the good news is that the mathematics doesn’t care if we work in one, two, three, or $n$ dimensions. It turns out that we can just continue applying the procedure that I outlined above for any number of dimensions you are interested in. Therefore, we can write the distance between two points given by $A = (x_1, x_2, x_3, \ldots , x_n)$ and $B = (y_1, y_2, y_3, \ldots , y_n)$ as:

$$d = \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2 + \cdots + (y_n - x_n)^2}$$

In other words, just keep taking the difference between the two points in each coordinate, squaring it, and adding the results under the square root to get the distance in any dimension.
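Since the procedure is the same in every dimension, it fits in a few lines of code. The following is only a sketch: the function name `distance` and the four-dimensional sample points are made up for illustration. For comparison, I also call `math.dist`, which the Python standard library provides (Python 3.8 and later) for the same computation.

```python
import math

def distance(a, b):
    """Distance between two points with the same number of coordinates."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    # Square each coordinate difference, add them up, then take the root.
    return math.sqrt(sum((y - x)**2 for x, y in zip(a, b)))

A = (1, 0, 2, 0)
B = (2, 2, 4, 4)
print(distance(A, B))   # differences 1, 2, 2, 4 give sqrt(1 + 4 + 4 + 16) = 5.0
print(math.dist(A, B))  # standard-library version of the same computation
```

Nothing in the function depends on the number of coordinates, which is the whole point: the same loop handles two, three, or $n$ dimensions.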

So why does this work? Or rather, did I just get lucky in the way that I drew my graph that the lines came out to be a right triangle?

The answer is no. I didn’t choose the lines randomly. If you look back at the graph, the lines I drew in are parallel to the axes. What this does is guarantee two things. First, it ensures that we can easily calculate the distance along either the horizontal ($\Delta x$) line or the vertical ($\Delta y$) line. The second thing is that, by drawing our lines parallel to the $x$ and $y$ axes, the two lines are perpendicular (orthogonal), which means the triangle we get is a right triangle, just what we need to use the Pythagorean theorem.
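To see why the right angle matters, it may help to compare with the more general law of cosines, which is not part of the derivation above but is a standard result. For a triangle with sides $a$ and $b$ meeting at an angle $\theta$, the third side $c$ satisfies:

$$c^2 = a^2 + b^2 - 2ab\cos\theta$$

When the two sides are perpendicular, $\theta = 90^\circ$ and $\cos\theta = 0$, so the extra term vanishes and we are left with $c^2 = a^2 + b^2$. If I had drawn the extra lines at any other angle, that last term would survive, and the simple form of the distance formula would not follow.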

I hope this makes the distance formula seem a bit more clear. The formula comes directly from the geometry of the situation, which I have demonstrated here. As such, the only thing to remember is that we need to use the Pythagorean theorem to find the distance between two points. After that, everything is a piece of cake.

  1. To assuage this fear a bit, I want to point out a property of a term that looks like $(a-b)^2$. We know that the only difference between $a-b$ and $b-a$ is that one will be negative and one will be positive (we’re assuming they aren’t the same number), but the absolute value is the same. Now, if we square this term, the result is always positive (a property of squaring a number). Therefore, the lesson I want to impart here is that the order of $x_{1}$ and $x_{2}$ doesn’t matter, since we are squaring the result.

  2. So why are lengths always positive, even though we know our $\Delta y$ or $\Delta x$ may be negative? I explained to you the “physical” way of thinking about it (I often think in the way of my background in physics), but this is also encoded mathematically in our distance formula. When I presented these two “extra” lengths in our graph, you agreed with me that they were $\Delta x$ and $\Delta y$. However, if you think about it, each of those line segments should be able to be computed using our distance formula, and indeed they can. If we take $\Delta x$ as an example, we know that $y$ does not change as we move along this line, so the distance formula gives $d = \sqrt{(\Delta x)^2} = |\Delta x|$. Note here that it doesn’t matter if $\Delta x$ is negative, since you have to first square it, and then take the square root. This has the pleasant effect of taking care of the issue with signs, so you don’t have to worry, and it matches our physical intuition about lengths being positive.

Examples Before Abstraction

When learning a new topic, there’s always a certain tension between two approaches: going straight to abstraction, or starting off easier with examples. I see this more and more as I learn about more complex and detailed physics and mathematics, and it has always made me wonder which way I should go about trying to learn. Just like anyone else, I want to get to a place where I feel fully comfortable with the concept in abstraction, but I don’t want to subject myself to a painful learning process by hitting myself against the brick wall of abstraction.

For a feeling of what this is like, do the following. Pick a topic that you feel comfortable with, and then look up that topic in some other resource. Chances are (particularly if you’re looking at a mathematics book), the equations and explanations you will see will look completely confusing. This has happened many times to me, and it’s humbling every time I go through this exercise. However, the simple reality is that it’s discouraging, since a topic you thought you knew has a bunch of other facets that you didn’t know about. Of course, this may be an incentive to learn more, but I know that, at least for me, it would make me want to stop.

The other reason I’ve been thinking about this is that I work with a lot of students who are taking classes I’ve already done. When I work with them, I’m often tempted to give them answers that are more abstract than they are used to. In my mind, the idea was to encourage them to see the concepts a bit more abstractly. However, having reflected on this for a while now, I’ve been worried that perhaps the abstraction was too much at their stage. It’s not that they couldn’t see the abstraction, but simply that they were already working hard to understand the concepts, so piling on more wasn’t helpful. I see that now in a way that I didn’t quite see before. It has come through my own self-study, where I’ve found it’s not always helpful to give the full abstraction or generalization before concretely looking at examples.

In order to illustrate this, consider the mathematical object called a tensor. The way I like to think of tensors is that they are generalizations of familiar objects like scalars, vectors, and matrices. I personally use tensors in my research within general relativity, but the point I want to make is that one way we can define a tensor is through its transformation properties. In general relativity (and in other applications of differential geometry), we like to work with the tools of calculus. However, since you may have heard that spacetime is curved, it’s not so simple to ask what a “regular” derivative is. In order to make up for this curvature, we have a new notion for derivatives, called the covariant derivative. It’s given by (for a tensor of rank $k + l$):

The point with this long equation isn’t for you to understand it. Heck, it’s still difficult for me to follow. However, what I think we can easily agree on is that this is not the first equation you want to show someone who is learning about tensors for the first time. Sure, it’s a general equation that applies to most situations, but the drawback is that it has a lot going on. This is a repeating pattern that I see a lot in general relativity. The equations are very long, which means they are difficult to analyze and it isn’t always easy to understand what they mean. Therefore, I would argue this is not suited to the beginner. Instead, in this particular case, I would make sure the student understands the two fundamental transformations that play a role here: the upper indices transformation, and the lower indices one. Putting these together gives us the equation above, but it also gives the student a sense of what is going on. Simply giving the student this above equation isn’t helpful on its own.
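As a sketch of what starting with examples might look like here, these are the two simplest cases of the covariant derivative as they appear in standard general relativity textbooks. The index names and the connection coefficients $\Gamma$ are the usual textbook notation, which I am assuming here. For a vector, with one upper index:

$$\nabla_\mu V^\nu = \partial_\mu V^\nu + \Gamma^\nu_{\mu\lambda} V^\lambda$$

For a covector, with one lower index:

$$\nabla_\mu \omega_\nu = \partial_\mu \omega_\nu - \Gamma^\lambda_{\mu\nu} \omega_\lambda$$

Each upper index picks up one correction term with a plus sign, and each lower index picks up one with a minus sign. The general rule for a tensor with $k$ upper and $l$ lower indices is then the ordinary derivative plus one such term per index, which is why it ends up so long.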

This is only one personal example, but like I’ve also mentioned, I’ve seen this kind of situation happen with the students I work with. As such, I’m trying to constantly remind myself that the material itself can be challenging to learn, and so I shouldn’t make things too abstract before the students are ready. I’m sure there are some who thrive by looking directly at the most abstract and general concepts, but I know that many people aren’t like that. Furthermore, I experience this exact scenario myself during my self-studying, so I shouldn’t try and bring the level of abstraction up too quickly.

After all, the next rung in the ladder of abstraction will always be there for the student to climb.

Contradiction as Zeros

This is going to be a very quick post, but it’s something I wanted to share since I think it could give some insight into a concept that is used to make proofs in mathematics. When writing proofs, it is often difficult to show that your statement holds for every case (say, if you’re trying to show that the square root of two cannot be written as a ratio of integers). Checking every single combination of integer ratios would take an infinite amount of time, so we want to come up with a strategy that is better. To do this, we try to prove our statement by contradiction. The idea is that we negate our conclusion, and from that, we need to show that there is a logical contradiction. Therefore, when we come to a contradiction, we know that our assumption of negating the conclusion was false, so that means our conclusion is true.

But why does a contradiction imply only that our conclusion is actually true? Why couldn’t it mean that something else is wrong with our proof? To see this, I like to think of it as finding the zeros of a function.

Remember that, if we have a series of terms that are multiplying each other and equal to zero, then this implies that at least one of these terms is zero. This comes from the fact that you can only get zero as a result if one of your factors is also zero. Now suppose that you know that some of these terms are never zero. Then, you can shorten your list of potential candidates that are zero, letting you solve the equation.
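As a small worked example of this idea (the particular equation is my own choice for illustration), suppose we want the real solutions of:

$$(x - 3)(x^2 + 1) = 0$$

At least one of the two factors must be zero. But for any real $x$ we have $x^2 \geq 0$, so $x^2 + 1 \geq 1$ and this factor is never zero. That leaves only one candidate:

$$x - 3 = 0 \quad \Rightarrow \quad x = 3$$

Knowing in advance that one factor cannot vanish is what pins the zero on the other.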

This closely resembles what we do in a proof by contradiction. In this case, the “terms” are our hypotheses, the things we assume are true. When we construct our proof, we always choose hypotheses that we assume to be true, so they won’t ever be false. We then add one more statement, which is the negation of our conclusion. The idea here is that we want this to be false (because we want the conclusion, and not its negation, to be true). From here, we simply use these hypotheses to generate new statements (which is the equivalent of multiplying in our analogy), and we finally come to a result. If we then see that our result is a contradiction, we know that at least one of our statements that we used is false. However, since the only statement we aren’t assuming to be true is the negation of our conclusion, we can say that this statement is false, which proves our conclusion.
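Here is how the square root of two example might be laid out in this style. This is only a sketch of the classic argument, and the labels $H_1$, $H_2$, and $N$ are my own. The hypotheses we trust are:

$$H_1: \text{every ratio of integers can be written in lowest terms} \qquad H_2: \text{if } p^2 \text{ is even, then } p \text{ is even}$$

The extra statement is the negation of our conclusion:

$$N: \sqrt{2} = \frac{p}{q} \text{ for some integers } p, q \text{ with } q \neq 0$$

By $H_1$ we may take $p/q$ in lowest terms. Squaring $N$ gives $p^2 = 2q^2$, so $p^2$ is even and, by $H_2$, $p = 2k$ for some integer $k$. Substituting gives $4k^2 = 2q^2$, so $q^2 = 2k^2$ and, by $H_2$ again, $q$ is even. Now $p$ and $q$ share the factor $2$, which contradicts lowest terms. Since $H_1$ and $H_2$ are known to be true, the only statement left to be false is $N$, so $\sqrt{2}$ is not a ratio of integers.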

This connection between finding zeros and doing a proof by contradiction is helpful to me since it shows me just how we know that a particular statement has to be false. It all comes down to the fact that we deliberately construct the group of statements such that only one will be false.

Conceptual Understanding

As a student, I know what it takes to get good grades. Essentially, you want to be able to reproduce the work that is taught in class during a test. You don’t need to be creative or original in your work. Rather, you simply need to understand the procedures and apply them (for the most part).

This is rather straightforward. After all, if you’ve worked through the homework in your class and have studied the material, it’s not too difficult to do fairly well in a given class. Questions become variations on a theme, so getting good grades is almost algorithmic.

However, one type of question in my classes that is more tricky to answer is the conceptual kind. This means that the question requires some sort of explanation and reasoning, rather than a calculation. It might not seem like it, but this is by far the more difficult type of question, since it is so ambiguous. There are the usual issues of not knowing if you’ve explained enough, but the real difficulty is that you can’t necessarily go through an equation to get an answer. That’s why I (and I assume many others) dread conceptual questions.

Additionally, it’s simply not easy to conceptually understand a topic in mathematics or physics, instead of being able to reproduce it. Just because I can calculate the change in entropy for a certain physical situation does not mean I can explain why the entropy increases or decreases in that situation. In other words, I can reproduce the calculation, but I might not be able to really explain it.

You might guess that this makes me nervous. Indeed, when my goal is to get good grades in a class, conceptual questions are not what I usually want to see in a test. They are tricky and less straightforward than calculations, which means I will tend to make mistakes more frequently. As such, I try to avoid this type of question.

I think that you can also guess that this isn’t a good thing. If you’ve read anything from me, I’m sure you’ve gotten the impression that the one thing I want people to have is the conceptual understanding instead of only the computational ability. Of course, you’d be right, and that’s what I want to talk about today.

In my mind, conceptual understanding is critical[1], but the problem in school is one of alignment. The reward systems in school don’t favour trying to ask conceptual questions, because they punish creative thinking in favour of being “right”. However, if the students never get to test their common sense and intuition about various subjects, why should we expect them to do well on a test with these kinds of questions?

One thing that I think everyone can agree on is that having a misconception about a subject is something we want to avoid. Put another way, we don’t want students to go through a subject with an incorrect view of a phenomenon, and then proceed to carry this incorrect mental picture with them for years later. Any teacher will tell you that they don’t want this to happen to their students. But the irony is that we do allow this by simply not asking enough conceptual questions to students!

The educational model in physics in particular isn’t set up for this kind of question. As such, we seem to ask fewer conceptual questions because they aren’t easily graded and take up time. The cost is that we let students make their own conclusions regarding the phenomena they learn about, and I am certain that students don’t get it right 100% of the time. Judging from my experience, it’s not even close. Consequently, we end up going through a course thinking that we know enough about the subject, only to be stumped by a conceptual question that either has us scratching our heads or confidently saying something that is incorrect.

The solution is obviously to tackle more conceptual questions when you are learning, but this isn’t as easy as it seems. While I think this is the answer, it’s not a practical suggestion at present, since it punishes students unfairly in terms of grades for attempting a question and being incorrect in their formulation. In my mind, this isn’t something that one needs to be tested on. Instead, conceptual understanding comes from years of engaging with the topic, but this lesson isn’t being taught when students have the mentality of “remember for the test, and then forget”. I know many of my friends who go through school with this mentality, and it’s something we need to work to discourage. Instead, I think conceptual questions need to be asked more, but not necessarily graded. I’m thinking of a weekly question that gets students thinking about a topic more deeply. Personally, I will be trying to do this with my own learning. If I come across a conceptual question I cannot answer, I will make sure I find the answer so that I can explain it easily.

I think physics is one of the few subjects where you can really dance between the rigour of mathematics and the simple explanations of intuition. As such, I think it’s useful to not be married to the former approach only, and to be able to explain topics without simply resorting to the mathematics. I think you know that I’m by no means against using mathematics, but doing the computation can sometimes evade the more difficult part of explaining[2].

  1. Indeed, if you only want computational ability, then I would suggest we teach everyone how to program calculations into a computer so that we don’t have to keep on doing them by hand.

  2. Of course, this same idea can be applied to mathematics, though everything gets a bit more abstract. In mathematics, it’s the difference between saying you know something and can prove it, versus merely being able to compute it.