So if there was a simple way of checking that the ISS is actually flying at an altitude of 400 km, it would be a hard blow to all kinds of flat Earth claims. Luckily for us, a fairly simple method exists. I suggested it a few years ago to a flat Earther, and eventually used it in practice in 2020. Here is how I did it.

The idea is very simple. You just need to observe an ISS pass from two or more places simultaneously. The ISS will be visible in a slightly different direction from each of the places (this is essentially just parallax). Knowing the positions of the observers and the directions from them to the ISS, it is possible to determine the position of the space station.

OK, sounds cool, but how do you measure a direction, and what does it even mean to "measure a direction"? There are multiple ways, but in this case the simplest one is to measure two angles: azimuth and elevation (angular altitude above the horizontal). The azimuth is the horizontal angle between the direction to the north and the direction to the point on the horizon directly below the observed object. Conventionally, east is expressed as an azimuth of 90°, south is 180°, west is 270°, and other numbers denote directions in between. The elevation, on the other hand, is the angle between the horizontal plane and the direction to the object. Measuring these two values is enough to establish the exact direction to the ISS.

How to measure these angles? Traditionally, azimuths can be measured e.g. with a compass, and elevations with a sextant, but... if you have ever seen an ISS pass, then you know that the ISS moves across the sky rather quickly. This means that it would be quite hard to measure the desired angles at some precise moment. We need a better way.

Since the main problem is the motion of the ISS, things would become much simpler if we could eliminate it somehow. This can be done very easily - by just taking a photo! The station will be motionless in the picture; we just need a way of reading the azimuth and elevation from it.

And how do we do that? We need some reference points, and there are some very convenient ones in the sky: the stars. We could measure the positions of the stars with a compass and a sextant, but we can also take a shortcut. Since these positions are very well known - to the point that celestial navigation is possible - we can just use some kind of almanac or an equivalent tool like Stellarium. That will tell us what the azimuths and elevations of selected stars were at the moment the photo of the ISS was taken.

The last issue to overcome is that in order for the stars to become visible in a photo of the ISS, a long exposure time is needed. That will make the station move a significant distance across the sky while the photo is being taken, which means that it won't be a single point in the picture, but a line. This is not a big problem, though: if we only know the times when the exposure started and finished, we can get the positions of the ISS at those moments just by looking at the ends of the line.

So now we have a way of getting the azimuth and elevation of the ISS as seen from some position, based on a photo. These two angles together with the observer's position determine some line in space - a line of possible positions of the ISS! Knowing more than one such line, we can find where they intersect, which will be the actual position of the ISS in space. Doing this for the points at both the start and the end of the exposure, we will get the station's positions at two distinct moments in time, which will allow us to calculate its speed, too. All that's left now is to actually do it :)

I made a few observations together with two friends; I'll focus on one of them here as an example. We took the photos below on May 16th, 2020, 22:45:25 UTC (May 17th, 0:45:25 Polish time) with 30-second exposures - me in Katowice, one friend near Łódź, and one friend near Mielec, all in Poland.

For starters, I'll show you how to get the azimuth and elevation of the start and end of the ISS trail based on the photo from Katowice.

The first step is to identify some stars. Below you'll find the photo with some brighter stars and constellations labelled:

I chose five of those stars for further processing: Cor Caroli, Megrez, Merak, Eltanin and Polaris. It's completely possible to use more stars - five is just a reasonable minimum, and I didn't want to spend too much time on the analysis, but generally the more, the better. I chose these five because they are fairly widely spread across the frame and clearly visible.

The next step is to write down the coordinates of the chosen stars. We need two sets of coordinates: x and y from the photo (you can find them using a graphical program like GIMP), and azimuth and elevation in reality. I got the last two from Stellarium:

The data from this photo looks like this:

```
# Merak
1073 2655 306.3924 49.7092
# Megrez
1372 1930 302.7408 59.5692
# Eltanin
4411 121 73.4577 68.8078
# Cor Caroli
200 630 263.5183 58.6198
# Polaris
3589 2949 0.1895 49.6069
```

Each row is x and y of the star in the photo, and then azimuth and elevation.
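These written-down values can be loaded with a short helper. This is just a sketch under my assumption about the format (a `# Name` comment line followed by a data row); the function name is made up for illustration:

```python
def parse_star_data(text):
    """Return a dict mapping star name -> (x, y, azimuth, elevation)."""
    stars = {}
    name = None
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith("#"):
            # A comment line carries the star's name.
            name = line.lstrip("# ").strip()
        elif line and name:
            x, y, az, el = (float(v) for v in line.split())
            stars[name] = (x, y, az, el)
    return stars

data = """
# Merak
1073 2655 306.3924 49.7092
# Megrez
1372 1930 302.7408 59.5692
# Eltanin
4411 121 73.4577 68.8078
# Cor Caroli
200 630 263.5183 58.6198
# Polaris
3589 2949 0.1895 49.6069
"""

stars = parse_star_data(data)
print(stars["Merak"])  # (1073.0, 2655.0, 306.3924, 49.7092)
```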

To get the azimuth and elevation of the ISS, we also need the x and y of the positions of the ISS in the photo - in this case, these are the following:

```
2346 2086
4300 1441
```

I processed the other two photos in the same fashion. After performing these steps we have all the necessary data, the only thing that's left is to do the math!

The first thing we need to do is to find the field of view of the camera. This will let us translate the pixels in the image into angles in reality.

In order to do that, we will start with a model of how photos are created. We will assume a coordinate system with the \(z\) axis along the optical axis of the camera, the \(x\) axis representing the left-right direction relative to the camera, and the \(y\) axis along the up-down direction, again relative to the camera. "Relative to the camera" means that if we, for example, laid the camera on its side, then the \(x\) axis would be *de facto* vertical, and the \(y\) axis would be horizontal.

We will assume that a point \((x, y)\) in the photo corresponds to a direction in space represented by a vector \(\mathbf{v} = (v_x, v_y, v_z)\) in the coordinates introduced above, where:

\[ v_x = \frac{w}{2} - x, \qquad v_y = \frac{h}{2} - y, \qquad v_z = z_{fov} \]

with \(w\) meaning the width of the photo in pixels, \(h\) - the height of the photo (also in pixels), and \(z_{fov}\) - a value related to the horizontal field of view of the camera. Specifically, since our assumptions mean that the point \(\left(\frac{w}{2}, \frac{h}{2}\right)\) in the image is exactly on the optical axis of the camera, and the point \(\left(w, \frac{h}{2}\right)\) is at the edge of the frame, we have:

\[ \tan\frac{\alpha}{2} = \frac{w}{2 z_{fov}} \]

where \(\alpha\) is the camera's horizontal field of view expressed as an angle.
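This relation is easy to check numerically. The sketch below uses the image width of 6000 pixels and the fitted \(z_{fov}\) value, both of which appear later in the article:

```python
import math

# tan(alpha / 2) = (w / 2) / z_fov  =>  alpha = 2 * atan(w / (2 * z_fov))
def horizontal_fov_deg(w, z_fov):
    return math.degrees(2 * math.atan(w / (2 * z_fov)))

print(round(horizontal_fov_deg(6000, 3946.5177505328793), 1))  # 74.5
```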

\(z_{fov}\) is a value we want to determine. How do we do that?

Let's assume two stars in the photo and denote their azimuths and elevations \(a_1, h_1\) respectively for the first star, and \(a_2, h_2\) for the second star. Azimuth \(a\) and elevation \(h\) correspond to the following vector in space:

\[ \mathbf{u} = (\cos h \cos a,\ \cos h \sin a,\ \sin h) \]

in coordinates in which the \(x\) axis is directed to the north, the \(y\) axis is to the east, and the \(z\) axis is upwards.

Based on such vectors \(\mathbf{u}_1, \mathbf{u}_2\) corresponding to two stars, we can calculate their angular distance in the sky, which we will denote \(\delta\):

\[ \delta = \arccos(\mathbf{u}_1 \cdot \mathbf{u}_2) \]

On the other hand, if we assume some value of \(z_{fov}\), we can calculate the angular distance between the stars based on the photo:

\[ \delta_{photo} = \arccos(\hat{\mathbf{v}}_1 \cdot \hat{\mathbf{v}}_2) \]

where:

\[ \hat{\mathbf{v}}_i = \frac{\mathbf{v}_i}{|\mathbf{v}_i|}, \qquad \mathbf{v}_i = \left(\frac{w}{2} - x_i,\ \frac{h}{2} - y_i,\ z_{fov}\right) \]

Hence:

\[ \delta_{photo} = \arccos\left(\frac{\mathbf{v}_1 \cdot \mathbf{v}_2}{|\mathbf{v}_1| |\mathbf{v}_2|}\right) \]

We can calculate such distances - real ones, and ones based on the photo - for multiple pairs of stars and then check how much the distances based on the photo (with some assumed \(z_{fov}\)) differ from the real ones. Then we can modify \(z_{fov}\) accordingly, check the difference again, and repeat the process until the distances calculated from the photo are as close as possible to the real ones. This process is best done using a computer, which is an ideal tool for performing lots of repetitive calculations - which is what I did, using the function scipy.optimize.least_squares from the SciPy library for Python.
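As an illustration, here is a minimal sketch of such a fit, using the star data written down above. It is not the author's original program, and the image size of 6000x4000 pixels is partly my assumption (the width of 6000 is mentioned later in the article; the height is a guess based on a typical 3:2 sensor):

```python
import itertools

import numpy as np
from scipy.optimize import least_squares

# Star data from the article: x, y in the photo, then azimuth and
# elevation in degrees.
stars = [
    (1073, 2655, 306.3924, 49.7092),  # Merak
    (1372, 1930, 302.7408, 59.5692),  # Megrez
    (4411, 121, 73.4577, 68.8078),    # Eltanin
    (200, 630, 263.5183, 58.6198),    # Cor Caroli
    (3589, 2949, 0.1895, 49.6069),    # Polaris
]
W, H = 6000, 4000  # photo size in pixels (height assumed)

def sky_vector(az, el):
    """Unit vector for an azimuth/elevation pair (x north, y east, z up)."""
    az, el = np.radians(az), np.radians(el)
    return np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])

def photo_vector(x, y, z_fov):
    """Direction vector for a pixel, in camera coordinates."""
    return np.array([W / 2 - x, H / 2 - y, z_fov])

def angle(u, v):
    return np.arccos(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def residuals(params):
    z_fov = params[0]
    res = []
    for s1, s2 in itertools.combinations(stars, 2):
        real = angle(sky_vector(s1[2], s1[3]), sky_vector(s2[2], s2[3]))
        photo = angle(photo_vector(s1[0], s1[1], z_fov),
                      photo_vector(s2[0], s2[1], z_fov))
        res.append(photo - real)
    return res

fit = least_squares(residuals, x0=[3000.0])
print(fit.x[0])  # close to the z_fov value reported below
```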

Performing the procedure described above on the data written down from my photo yields the following result:

```
z_fov: 3946.5177505328793
```

which, with the image width being 6000 pixels, corresponds to a horizontal field of view of approx. 74.5°. I took the photo using a lens with a 35 mm equivalent focal length of 24 mm, which gives a theoretical value for the horizontal field of view of 73.7°.

The next step is to calculate the camera's orientation in space. We will express the orientation using Euler angles in a coordinate system in which the \(x\) axis is directed to the north, the \(y\) axis is upwards, and the \(z\) axis is to the east. You may have noticed that this system resembles the one we used earlier for calculating the angular distances between stars, but with reordered axes. Why did I choose such a system? The reason is simple: this system is oriented in such a way that when the camera's axes are parallel to the axes in this system, we have a camera that is perfectly level and looking to the north. It will be convenient to have such an orientation correspond to angles of \(0, 0, 0\), and any other orientation to correspond to nonzero Euler angles.

But I'm getting a bit ahead of myself. Let's maybe start with a few words about Euler angles. As we can read on Wikipedia, it is a set of 3 angles describing an orientation of a rigid body, or a coordinate system, relative to another coordinate system. In short: these angles tell us how we have to rotate a coordinate system around some 3 specific axes in order to have the axes of this system become parallel to the axes of another system.

In our case, we can imagine these angles this way:

- We start with a camera that is level and looking to the north.
- We rotate the camera by angle \(\alpha\) around the vertical axis. The camera is still level, but is now looking in a different direction - specifically, in a direction with an azimuth equal to \(\alpha\).
- We rotate the camera by angle \(\beta\) around its left-right axis. This means that the camera's horizontal axis will still be horizontal, but the camera is no longer looking horizontally, but towards a point with an elevation of \(\beta\).
- We rotate the camera by angle \(\gamma\) around its optical axis. Now the camera is still looking at the same point, with an azimuth of \(\alpha\) and elevation of \(\beta\), but its left-right axis might no longer be horizontal.

By choosing the right values of these 3 angles, we can express any orientation of the camera in space. What's left is to calculate what the angles were in the case of my camera when I was taking my photo.

The method I used is identical to the one I described for calculating the field of view. I assume some values for the Euler angles and check whether the calculated positions of the stars in the photo are the same as their actual positions. If not, I modify the angles until I get as close as possible to the correct result - again with the help of scipy.optimize.least_squares.

Specifically, the maths looks like this - for every star in the photo, I'm performing the following steps:

- Calculate the vectors \(\mathbf{v}_{photo}\) and \(\mathbf{v}_{real}\) (which are the vectors corresponding to the direction to the star relative to the camera, and to the direction in reality based on the star's azimuth and elevation, in the coordinate system described earlier). In addition, normalize the vector \(\mathbf{v}_{photo}\) (which just means dividing it by its magnitude, so that the magnitude of the result is 1 - the vector \(\mathbf{v}_{real}\) has a magnitude of 1 from the start).
- Transform the vector \(\mathbf{v}_{photo}\) according to the Euler angles, denote the result by \(\mathbf{v}_{rot}\).
- Calculate the error for this star, equal to \(|\mathbf{v}_{rot} - \mathbf{v}_{real}|^2\).

The sum of all the errors for all the stars is then taken to be a measure of the error in fitting the angles. The program then aims to minimize this error, which makes it find the right set of Euler angles.
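To show the mechanics of this fit without the full photo pipeline, here is a sketch on synthetic data: we pick a "true" orientation, generate matching direction vectors, and recover the angles with scipy.optimize.least_squares. The axis order passed to `Rotation.from_euler` is my assumption, not necessarily the author's exact convention:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)
# A "true" camera orientation to recover (angles roughly like those
# found for the Katowice photo):
true_rot = Rotation.from_euler("yxz", [-9.5, -64.3, -7.6], degrees=True)

# Unit vectors pointing at some made-up stars, as seen by the camera:
v_photo = rng.normal(size=(8, 3))
v_photo /= np.linalg.norm(v_photo, axis=1, keepdims=True)
# The same directions in the sky coordinate system:
v_real = true_rot.apply(v_photo)

def residuals(angles):
    rot = Rotation.from_euler("yxz", angles, degrees=True)
    # Difference between rotated camera directions and real directions.
    return (rot.apply(v_photo) - v_real).ravel()

fit = least_squares(residuals, x0=[0.0, 0.0, 0.0])
print(np.round(fit.x, 3))  # angles describing the same rotation as true_rot
```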

After applying this to my photo, we get:

```
Found Euler angles:
Alpha: -9.546900747272263
Beta: -64.3326126654504
Gamma: -7.645562378256375
```

This would mean that the camera looked almost directly to the north (sounds about right, you can see that in the position of Polaris), was almost level (a low \(\gamma\) angle, which also sounds right) and was looking at an angle of roughly 64° upwards (which also sounds right - although note that we obviously have the sign of the angle reversed, but that's just due to some mistake in defining the axes).

The next step is to find the azimuth and elevation of the ISS. This step is fairly simple at this point - having \(z_{fov}\) and the Euler angles, we can just take the ISS coordinates from the photo, convert them into a vector \(\mathbf{v}\), normalize it, transform it using the Euler angles and calculate the azimuth \(a\) and elevation \(h\) from the resulting coordinates \((v_x, v_y, v_z)\) (with \(x\) pointing north, \(y\) up and \(z\) east) like so:

\[ h = \arcsin(v_y), \qquad a = \arctan\frac{v_z}{v_x} \]

To be more specific, in code it is best to use another function for the azimuth, commonly called "atan2". The regular inverse tangent (atan, *arcus tangens*) can only return a value between -90° and 90°, which doesn't cover the full 360° range of azimuths. "atan2" is a two-argument inverse tangent which takes into account the signs of the numerator and denominator, and thanks to this it can give a value between -180° and 180°.
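In Python this looks like the sketch below; the `% 360` at the end additionally wraps the result into the conventional 0-360° azimuth range:

```python
import math

def azimuth_deg(north, east):
    # atan2 takes the "numerator" (east component) and "denominator"
    # (north component) separately, so it can tell all quadrants apart.
    return math.degrees(math.atan2(east, north)) % 360

print(azimuth_deg(1.0, 0.0))   # 0.0 (north)
print(azimuth_deg(0.0, -1.0))  # 270.0 (west)
```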

Performing these calculations on the data for the starting point of the ISS trail from my photo yields the result:

```
Azimuth and elevation of the ISS: 329.3345431204687 62.77086064445405
```

According to Stellarium, the azimuth and elevation of the ISS at that moment should have been 329.3554 and 62.7508. I'd say that's a fairly satisfying accuracy :)

Having the azimuths and elevations of the starting and ending points of the ISS trail in the photo, we now just need to find the point in space where the lines of sight of different observers intersect.

However, we need to take into account that the azimuths and elevations are directly related to local directions like the vertical or north. Specifically, if we express the direction as a vector, then the azimuth \(a\) and elevation \(h\) correspond to the following one:

\[ \mathbf{v} = \cos h \cos a \,\mathbf{n} + \cos h \sin a \,\mathbf{e} + \sin h \,\mathbf{u} \]

The vectors \(\mathbf{n}\), \(\mathbf{e}\) and \(\mathbf{u}\) are unit vectors corresponding to the local directions to the north, to the east and upwards. Their exact representations can depend on the coordinate system we use (we actually had an opportunity to see that before, when we used a slightly different system for finding the Euler angles than the one for angular distances between the stars).

In order to find the point of intersection of the lines of sight determined by each photo, we need to express all the positions and directions in a single global coordinate system.

This is where one more complication arises - for the purposes of arguing with flat Earthers, I will want to perform the calculations for both flat and globe Earth. This means that I will have to introduce two separate coordinate systems.

Let's start with the case of flat Earth, because it's the simpler one. We will assume that the Earth is a disk in the \(xy\) plane with the \(x\) axis passing through Greenwich, and the North Pole being at the point \((0, 0, 0)\). We will assume the \(z\) axis to be perpendicular to the disk. Moreover, we will assume that latitude corresponds to the distance from the pole to the given point (as on the azimuthal equidistant map, with 10 000 km from the pole to the equator). Specifically, latitude \(\varphi\) and longitude \(\lambda\) will correspond to the following coordinates in space:

\[ r = \frac{90° - \varphi}{90°} \cdot 10\,000\ \mathrm{km}, \qquad (x, y, z) = (r \cos\lambda,\ r \sin\lambda,\ 0) \]

(The coordinates are expressed in kilometers, and we assume that the observer is at sea level - which introduces some inaccuracy, but a small one; in reality all of us were less than 300 meters above sea level.)

The direction vectors will then look the following way:

\[ \mathbf{n} = (-\cos\lambda,\ -\sin\lambda,\ 0), \qquad \mathbf{e} = (-\sin\lambda,\ \cos\lambda,\ 0), \qquad \mathbf{u} = (0,\ 0,\ 1) \]

For a globe Earth, we will introduce coordinates in which \((0, 0, 0)\) is the Earth's center, the \(z\) axis is passing through the poles, and the \(x\) axis passes through the intersection of the equator with the Greenwich meridian. Assuming the Earth's radius \(R = 6371\ \mathrm{km}\), we can express the position of the observer in the following way:

\[ (x, y, z) = (R \cos\varphi \cos\lambda,\ R \cos\varphi \sin\lambda,\ R \sin\varphi) \]

The directions will look like this:

\[ \mathbf{n} = (-\sin\varphi \cos\lambda,\ -\sin\varphi \sin\lambda,\ \cos\varphi) \]
\[ \mathbf{e} = (-\sin\lambda,\ \cos\lambda,\ 0) \]
\[ \mathbf{u} = (\cos\varphi \cos\lambda,\ \cos\varphi \sin\lambda,\ \sin\varphi) \]
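As a sketch, the globe-Earth observer position and local direction vectors can be computed like this (assuming a spherical Earth with \(R = 6371\) km; the function name is mine):

```python
import numpy as np

R_EARTH = 6371.0  # km

def globe_observer(lat_deg, lon_deg):
    """Observer position and local north/east/up unit vectors in
    Earth-centered coordinates (z through the poles, x through the
    intersection of the equator and the Greenwich meridian)."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    position = R_EARTH * np.array([np.cos(lat) * np.cos(lon),
                                   np.cos(lat) * np.sin(lon),
                                   np.sin(lat)])
    up = position / R_EARTH
    north = np.array([-np.sin(lat) * np.cos(lon),
                      -np.sin(lat) * np.sin(lon),
                      np.cos(lat)])
    east = np.array([-np.sin(lon), np.cos(lon), 0.0])
    return position, north, east, up

# Sanity check at the equator on the Greenwich meridian: the observer
# sits on the x axis, north points along z, east along y.
pos, n, e, u = globe_observer(0.0, 0.0)
print(pos, n, e, u)
```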

We finally reach the stage where we fit the position of the ISS. "Fit", because the method will be similar to the one for the field of view and the Euler angles - we will choose some coordinates \((x, y, z)\), calculate some measure of error, and then we'll tell the computer to find the coordinates that minimize the error.

For the measure of error, we will choose the quadratic average of the distances of the position from the lines of sight determined from the photos. Let me explain.

Having a line defined by some point \(A\) and direction \(\mathbf{d}\) (we also assume that the magnitude of \(\mathbf{d}\) is 1), we can calculate the distance of a point \(P\) from this line the following way:

- Calculate the vector from \(A\) to \(P\): \(\mathbf{w} = P - A\).
- Find the projection of \(\mathbf{w}\) onto \(\mathbf{d}\), which is: \((\mathbf{w} \cdot \mathbf{d})\,\mathbf{d}\).
- The difference between \(\mathbf{w}\) and its projection on \(\mathbf{d}\) is the component of \(\mathbf{w}\) that is perpendicular to \(\mathbf{d}\). This component is a vector perpendicular to our line, from some point on the line to point \(P\) - the magnitude of this vector is the distance we're looking for.

So, the distance of the point \(P\) from the line is equal to:

\[ d = |\mathbf{w} - (\mathbf{w} \cdot \mathbf{d})\,\mathbf{d}| \]
In our case, point \(P\) is the assumed position of the ISS, point \(A\) is the observer's location, and vector \(\mathbf{d}\) is the vector determined by the azimuth and elevation of the ISS and the position of the observer in the global coordinates.

For the \(i\)-th observer, we calculate \(d_i\): the distance of the ISS from the line determined by that observer's photo, and then we calculate the average distance of the ISS position from the \(n\) lines:

\[ d_{avg} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} d_i^2} \]

We consider the ISS to be at the position that minimizes this average. If all the lines intersect at a single point, this average will then be 0 - but since our measurements aren't perfectly accurate, the chances of this happening are rather low.
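Here is a minimal sketch of this fitting step, with two made-up observers whose sight lines intersect at a known point; minimizing the sum of squared distances with least_squares is equivalent to minimizing the average described above:

```python
import numpy as np
from scipy.optimize import least_squares

def distance_from_line(point, line_point, direction):
    """Distance of `point` from the line through `line_point` along the
    unit vector `direction`."""
    w = point - line_point
    return np.linalg.norm(w - np.dot(w, direction) * direction)

def residuals(point, observers, directions):
    return [distance_from_line(point, obs, d)
            for obs, d in zip(observers, directions)]

# Two made-up observers whose sight lines both pass through (0, 0, 400):
target = np.array([0.0, 0.0, 400.0])
observers = [np.array([100.0, 0.0, 0.0]), np.array([0.0, -150.0, 0.0])]
directions = [(target - obs) / np.linalg.norm(target - obs)
              for obs in observers]

fit = least_squares(residuals, x0=np.zeros(3), args=(observers, directions))
print(np.round(fit.x, 3))  # close to [0, 0, 400]
```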

By the way, note that I didn't use the regular, arithmetic mean here, but something called the "quadratic mean". Why?

Assume we have two lines and that the distance between them at the closest point is, say, 6 km. If we take a point on that 6 km line segment, between the two lines, it will be at a distance of \(x\) from one of the lines, and \(6 - x\) from the other one. The "normal" mean of these two distances is then \(\frac{x + (6 - x)}{2} = 3\). It doesn't matter which point between the lines we choose, the average distance will be 3 km.

It is different with the quadratic mean. This mean will be \(\sqrt{\frac{x^2 + (6 - x)^2}{2}}\). This number depends on \(x\) and is smallest when \(x\) is 3. In other words, if we aim to minimize this mean, we will find the point precisely halfway between the two lines. If we used the arithmetic mean, we could get any point on the 6 km line segment between the two lines.

The last piece of information we can extract from the photos is the speed of the ISS - specifically, the approximate average speed during the 30 seconds of exposure. How to find it? Just calculate the position of the ISS at the beginning of the exposure, then the position at the end, calculate the distance between the two positions and divide by 30 seconds. Done.

Let's finally take a look at the results!

For the 3 photos presented above, the result looks like this:

```
Globe Earth:
Average distance from lines of sight: 8.02 km
2020-05-16 22:45:25: h = 410.98 km, lat = 51.7392, lon = 17.5184
Average distance from lines of sight: 3.38 km
2020-05-16 22:45:55: h = 422.20 km, lat = 51.7000, lon = 20.7150
Speed: 7829.6 m/s

Flat Earth:
Average distance from lines of sight: 5.72 km
2020-05-16 22:45:25: h = 390.90 km, lat = 51.7581, lon = 17.6299
Average distance from lines of sight: 3.55 km
2020-05-16 22:45:55: h = 407.19 km, lat = 51.7310, lon = 20.6846
Speed: 7573.2 m/s
```

The average distances listed here are the measures of error I described above.
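As a quick sanity check on the globe-Earth numbers, we can recompute the speed from the two fitted positions ourselves (assuming a spherical Earth of radius 6371 km; the straight chord between the positions slightly underestimates the arc, but over 30 seconds the difference is negligible):

```python
import numpy as np

R_EARTH = 6371.0  # km

def ecef_km(h_km, lat_deg, lon_deg):
    """Position in Earth-centered coordinates for a given altitude,
    latitude and longitude (spherical Earth assumed)."""
    r = R_EARTH + h_km
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return r * np.array([np.cos(lat) * np.cos(lon),
                         np.cos(lat) * np.sin(lon),
                         np.sin(lat)])

p1 = ecef_km(410.98, 51.7392, 17.5184)
p2 = ecef_km(422.20, 51.7000, 20.7150)
speed_m_s = np.linalg.norm(p2 - p1) * 1000 / 30  # 30 s of exposure
print(round(speed_m_s))  # roughly 7800 m/s
```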

As you can see, regardless of whether we assume the Earth to be spherical or flat, the result is that the ISS - whatever it is - moves at an altitude of approximately 400 km with a speed exceeding 7 km/s. There are no balloons or drones that can do that. The only reasonable explanation is simple: the ISS is in fact a space station, orbiting the spherical Earth.

I need to point out one more thing here: somebody will definitely notice that the average distance from lines of sight is smaller for flat Earth than for the globe, at least for the starting position. One could be tempted to consider this evidence for the flatness of the Earth, but I must disappoint those who were about to claim this: the difference here is much too small to mean anything. The uncertainty of the directions calculated from the photos alone most likely causes differences larger than these few kilometers. If the difference was large, like a few km in one case, and a few tens of km in the other, then it would be something to be discussed. But with these results, any conclusions based on that would be premature.

By the way, it might be possible to use the observations of the ISS to distinguish between a spherical Earth and a flat Earth with data collected from a larger area. We were observing the ISS from a triangle with sides of 170-200 km, when it is visible at any moment from a circle with a radius of roughly 1400 km. If more people were involved, and were taking observations of the ISS from places a few hundred km from each other, such that some would be seeing it near the zenith, and some near the horizon - it could turn out that one of the shapes of the Earth fits the data much better than the other one.

Measuring the position of the ISS turned out to be great fun, and the accuracy of the results exceeded my expectations :)

We can repeat these observations. If you're interested in doing that, come to my Discord server, where we can coordinate the undertaking (but beware: there are mostly people arguing with flat Earthers on YouTube there, so if you come to the server, don't be surprised ;) ). The more people are photographing the ISS simultaneously, the better the result, so feel invited to help :)

I suspect that flat Earthers will remain unconvinced, despite fairly conclusive results about the nature of the ISS - but that's not what this was about. It was about good fun and seeing what kind of results can be obtained, and these goals were fully accomplished :)

One of the consequences of relativity is that faster moving objects are harder to accelerate, which means that their inertia increases. And since we are taught from our first physics lessons that mass is the measure of inertia, it is tempting to try to explain this effect with an increase in mass. So, the notion of mass gets split into "rest mass" - the mass an object has at rest - and "relativistic mass" - the mass of the object in motion, larger than the rest mass. The equations also become prettier right away, since if we denote the relativistic mass by \(m_{rel}\), we can always write \(E = m_{rel} c^2\), and momentum can be expressed using the formula known from classical physics, \(\mathbf{p} = m_{rel} \mathbf{v}\) (versions with the rest mass also have an ugly square root in the denominator - we'll see it later). This is the life!

If you follow articles or discussions about relativity on the internet, you have probably noticed relativistic mass being mentioned in multiple contexts. It is often used to explain the impossibility of reaching the speed of light ("because the mass would grow to infinity"), or sometimes someone will ask whether an object can become a black hole by going fast enough (it can't). The relativistic increase in mass is treated as fact in such situations, as something certain.

Well, I'd like to disturb this state of affairs slightly with this article ;) Because, as it turns out, the notion of relativistic mass loses a lot of its appeal upon closer scrutiny. As a result, relativistic mass is rarely used in academia and you can encounter it pretty much only at school, in discussions on the internet and in popular science publications. Let's take a closer look at the reasons behind that.

Wait a minute - I just said that the inertia of objects in motion increases, and that mass is the measure of inertia. And now I'm saying that the mass doesn't increase? What is all this about?

In pre-relativistic physics, it was simple. If you applied a force to an object, it would accelerate - and the ratio of the force to the acceleration was precisely the mass. So the harder an object is to accelerate - which means, the smaller the acceleration caused by the same force - the larger its mass. This can be expressed with an equation as \(m = \frac{F}{a}\).

What's more, an object would always accelerate in the same direction in which the force was acting (which might seem obvious... but let's not get ahead of ourselves). This means that the relationship above can be written using vectors and it will still hold: \(\mathbf{F} = m\mathbf{a}\). This way we take into account that forces can act in various directions and we can still calculate the acceleration.

So what does relativity change in this picture?

Let's keep the distinction between the rest mass and relativistic mass for now. An object at rest has mass \(m_0\). According to the idea of relativistic mass, it increases in motion to:

\[ m_{rel} = \frac{m_0}{\sqrt{1 - \frac{v^2}{c^2}}} \]

The reciprocal of this ugly square root in the denominator is usually denoted by \(\gamma\), which lets us express the relativistic mass as:

\[ m_{rel} = \gamma m_0 \]

When the velocity is 0, \(\gamma = 1\) and the relativistic mass is simply equal to the rest mass.
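Since \(\gamma\) will keep showing up below, here is a quick numerical feel for it (speeds given as fractions of \(c\); a small sketch of mine, not from the original article):

```python
import math

def gamma(beta):
    # beta is the speed expressed as a fraction of the speed of light
    return 1 / math.sqrt(1 - beta**2)

print(gamma(0.0))                # 1.0 - no motion, no increase
print(round(gamma(0.5), 4))      # 1.1547
print(round(gamma(0.99995), 1))  # 100.0 - the kind of speed needed for
                                 # the 100-gram example further down
```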

That's cool. We introduced the relativistic mass because we wanted to account for the increase in inertia by the increase in mass. So, can we still write \(\mathbf{F} = m_{rel}\mathbf{a}\)?

Well... almost. As it turns out, we can - but only if the force is *perpendicular* to the direction of motion! If it is parallel, the result is different - then we need to calculate the acceleration like this: \(a = \frac{F}{\gamma^2 m_{rel}} = \frac{F}{\gamma^3 m_0}\) (I'll derive this below for the curious).

Wait... what does it even mean? Well, it means that the object has larger inertia in the direction of motion than perpendicular to it. So if we want to find the acceleration of the object when a force is applied, we can't just divide it by the object's mass - we have to decompose the force into parallel and perpendicular components, divide them by different numbers and then add the results to get a single acceleration. In effect it is possible that when the force acts at an angle to the velocity, the acceleration might not even be in the direction of the force!
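This decomposition is easy to play with numerically. Here is a sketch in units where \(c = 1\) and with rest mass \(m = 1\) (the function and the values are mine, for illustration only):

```python
import numpy as np

def acceleration(force, velocity):
    """Relativistic acceleration (c = 1, m = 1) for a given force."""
    v = np.linalg.norm(velocity)
    gamma = 1 / np.sqrt(1 - v**2)
    v_hat = velocity / v
    f_par = np.dot(force, v_hat) * v_hat   # component along the motion
    f_perp = force - f_par                 # component across the motion
    return f_par / gamma**3 + f_perp / gamma

vel = np.array([0.9, 0.0, 0.0])    # 0.9 c along the x axis
force = np.array([1.0, 1.0, 0.0])  # a force at 45 degrees to the velocity
a = acceleration(force, vel)
print(a)  # the x component is much smaller than the y component
```

Here the force acts at 45° to the velocity, but the resulting acceleration leans strongly towards the perpendicular direction, because the parallel component is divided by \(\gamma^3\) rather than \(\gamma\).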

The conclusion is that the "relativistic mass" doesn't solve the problem of inertia. There isn't even a single number that would reasonably measure inertia, because now it depends on the direction! (It is still possible to introduce a quantity measuring inertia - only now it has to be a tensor of rank 2, which can be written as a matrix.)

That's the first point against the relativistic mass.

One of the ways of comparing masses of different objects is making them collide. If a more massive object hits a less massive one, then the former will only slow down slightly, and the latter will get bounced at a significant speed. Conversely, if a less massive object hits a more massive one, the former bounces away, and the latter picks up only a small amount of speed.

In a special case in which an object hits another one that is at rest and has the same mass - the former will stop, and the latter will fly away with the same speed the former had initially. Newton's cradle is one of the better known illustrations of this fact.

The exact formulae for the velocities after the collision depending on the velocities before the collision and the masses of objects can be derived using the conservation laws for energy and momentum. We won't be doing this here, we'll just use the intuitive understanding described above.

The question now is this: what happens if the two objects colliding have the same rest masses, but vastly different relativistic masses? For example, what if one ball with a mass of 1 gram is at rest, and a second, identical one going fast enough to have a relativistic mass of 100 grams (so moving with \(\gamma = 100\)) hits it? Will they behave like objects with different masses (the moving one will slow down, and the resting one will start moving with some speed), or like equals (the moving one will stop, and the resting one will start moving with the same speed the moving one had before)?

The answer lies again within the conservation laws, only this time the relativistic ones have to be used. I'll write the full equations below, and here I'll just tell you the result: as it turns out, the balls will behave *like equals.* So, relativistic mass is irrelevant in collisions - the only mass that counts is the rest mass.
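The key observation can be checked directly with the conservation laws themselves: if the balls simply exchange roles, the total relativistic energy and momentum before and after the collision are identical term by term, so the "exchange" outcome is consistent with the conservation laws. A sketch with \(c = 1\) and rest masses of 1:

```python
import math

def energy(v):
    # Relativistic energy, c = 1, rest mass 1: E = gamma * m * c^2
    return 1 / math.sqrt(1 - v**2)

def momentum(v):
    # Relativistic momentum, c = 1, rest mass 1: p = gamma * m * v
    return v / math.sqrt(1 - v**2)

v = 0.8  # initial speed of the moving ball, as a fraction of c

# Before: one ball moving with v, one at rest.
E_before = energy(v) + energy(0.0)
p_before = momentum(v) + momentum(0.0)

# After: the first ball at rest, the second moving with v.
E_after = energy(0.0) + energy(v)
p_after = momentum(0.0) + momentum(v)

print(E_before == E_after, p_before == p_after)  # True True
```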

That's another point against the relativistic mass.

This isn't a physical argument, strictly speaking, more like a technical one, but it still carries some weight.

Physicists like to simplify their lives. One of the simplifications they like to make is eliminating the constants of nature from the equations. Let me explain.

Let's take the speed of light as an example. We can check in some books or on the internet that it is equal to 299 792 458 m/s. The number is rather ugly, but we have no say in what nature chose as the speed of electromagnetic waves... or do we?

This particular number only comes from the units we chose. If we wanted to express the speed of light in feet per second, the number would be different. If we chose furlongs per fortnight, we'd get another different number. Hmm... What if we made our lives simpler and chose units such that the number was somewhat simpler? For example, if it was... just 1?

We can do that, and that's exactly what physicists do. Example units like that could be a second and a light-second. Or a year and a light-year. Or any time unit and the distance light travels during that time. If we choose such a system of units, we'll have \(c = 1\). And just like that, \(c\) disappears from all the equations, because whether we multiply or divide by 1, it doesn't ever change anything.

(Physicists like to take it a step further and eliminate more constants. The units in which \(c = G = \hbar = 1\) - so ones in which the speed of light, the gravitational constant and the reduced Planck constant are all 1 - are a popular choice. These units are called "natural units" or... "Planck units". The basic units in this system are the Planck length, Planck time and Planck mass.)

Alright, but why am I mentioning all this? Well, let's see what happens to the famous Einstein formula for energy in such a system:

\[ E = m_{rel} c^2 \]

When we introduce units in which \(c = 1\), we get:

\[ E = m_{rel} \]

The relativistic mass is always equal to the energy in such units! So it is a *de facto* duplicate of the notion of energy. Everywhere we would use the relativistic mass before, we can just substitute in the energy (in other units: \(\frac{E}{c^2}\)) and nothing will change. Why would we need such an additional notion, then?

And that's yet another point against the relativistic mass.

As you can see, introducing the notion of relativistic mass doesn't really get us much. It isn't great for measuring inertia, it's useless in collisions, and is *de facto* a duplicate of energy. For these reasons, physicists pretty much stopped using this notion - now, when mass is mentioned, it almost always means the rest mass.

And because of that, I'm asking you, dear Readers - let's stop saying that mass increases for objects in motion. Let's stop saying that it becomes infinite at the speed of light. We can go ahead and substitute "inertia" for "mass" in these contexts, or - since it is a notion almost equivalent to relativistic mass - just use "energy". Both inertia and energy tend to infinity as the speed tends to \(c\). Let mass remain a property that is constant for a given object.

Two key equations we will need are the relativistic expressions for energy and momentum:

\[ E = \gamma m c^2, \qquad \mathbf{p} = \gamma m \mathbf{v} \]

\(m\) is the rest mass in both of these equations - we forget that relativistic mass is even a thing now.

And, as always:

\[ \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \]

It will be our goal to express the acceleration using the force, mass and velocity.

We'll assume the following equation (which is correct in non-relativistic physics, too) as the definition of the force:

\[ \mathbf{F} = \frac{d\mathbf{p}}{dt} \]

We have the formula for momentum, so we just need to start differentiating ;) We get:

\[ \mathbf{F} = \frac{d(\gamma m \mathbf{v})}{dt} = m \frac{d\gamma}{dt} \mathbf{v} + \gamma m \frac{d\mathbf{v}}{dt} = m \frac{d\gamma}{dt} \mathbf{v} + \gamma m \mathbf{a} \]

The second term strongly resembles the force known from Newtonian physics (if we used the relativistic mass), but there is still the first term. Let's focus just on the derivative of \(\gamma\) for now:

\[ \frac{d\gamma}{dt} = \frac{d}{dt} \left( 1 - \frac{v^2}{c^2} \right)^{-\frac{1}{2}} = \frac{\gamma^3}{2 c^2} \frac{d(v^2)}{dt} \]

We'll notice now that \(v^2\) is simply \(\mathbf{v} \cdot \mathbf{v}\), which is the dot product of the velocity vector with itself. The dot product behaves just like a regular product with respect to differentiation, so we get:

\[ \frac{d(v^2)}{dt} = \frac{d(\mathbf{v} \cdot \mathbf{v})}{dt} = 2\,\mathbf{v} \cdot \frac{d\mathbf{v}}{dt} = 2\,\mathbf{v} \cdot \mathbf{a} \]

Hence:

\[ \mathbf{F} = \frac{\gamma^3 m}{c^2} (\mathbf{v} \cdot \mathbf{a})\,\mathbf{v} + \gamma m \mathbf{a} \]

(Let's remember this form of the equation, we'll come back to it in a moment.)

Now, let us take note of a property of the dot product: namely that $\vec{a} \cdot \vec{b} = ab\cos\theta$, where $a$ and $b$ are the magnitudes of the respective vectors, and $\theta$ is the angle between them. In particular, if the vectors are perpendicular, the dot product is zero, and if they are parallel, it is equal to the product of the magnitudes (up to the sign).

Thus, our $\vec{v} \cdot \vec{a}$ can be written as $v\,a_\parallel$, where $a_\parallel$ is the component of the acceleration that is parallel to the velocity, or, to be more precise, the projection of the acceleration onto the direction of the velocity (which will be negative if the angle between the acceleration and the velocity exceeds 90 degrees). The perpendicular component has no effect here.

On the other hand, $\vec{v}$ can be written as $v\,\hat{v}$, where $\hat{v}$ is a unit vector (a vector of magnitude 1) with the same direction and sense as the velocity.

Thanks to these "tricks", we can write the first term as:

$$\gamma^3 m\,\frac{v\,a_\parallel}{c^2}\,v\,\hat{v} = \gamma^3 m\,\frac{v^2}{c^2}\,a_\parallel\,\hat{v}$$

Now let's note that $a_\parallel \hat{v}$ is a vector that has the same direction as the velocity, a sense dependent on the sign of the projection of the acceleration on the direction of the velocity, and a magnitude equal to that projection - so it's simply the parallel component of the acceleration, $\vec{a}_\parallel$! Let's also decompose $\vec{a}$ in the second term of the force into $\vec{a}_\parallel + \vec{a}_\perp$, and we'll get:

$$\vec{F} = \gamma^3 m\,\frac{v^2}{c^2}\,\vec{a}_\parallel + \gamma m \left(\vec{a}_\parallel + \vec{a}_\perp\right)$$

Now the only thing that's left is to simplify the coefficient of $\vec{a}_\parallel$:

$$\gamma^3\,\frac{v^2}{c^2} + \gamma = \gamma^3\left(\frac{v^2}{c^2} + \frac{1}{\gamma^2}\right) = \gamma^3\left(\frac{v^2}{c^2} + 1 - \frac{v^2}{c^2}\right) = \gamma^3$$

Hence the final equation:

$$\vec{F} = \gamma^3 m\,\vec{a}_\parallel + \gamma m\,\vec{a}_\perp$$

Thus:

$$\vec{a} = \frac{\vec{F}_\parallel}{\gamma^3 m} + \frac{\vec{F}_\perp}{\gamma m}$$

Coming back for a moment to the equation I told you to remember - I mean the form which still contained $\vec{v} \cdot \vec{a}$. As it turns out, such a term can be written as:

$$\gamma^3 m\,\frac{\vec{v} \cdot \vec{a}}{c^2}\,\vec{v} = M\vec{a}$$

where $M$ is a 3x3 matrix with components given by:

$$M_{ij} = \gamma^3 m\,\frac{v_i\,v_j}{c^2}$$

The whole equation can then be written as:

$$\vec{F} = \left(M + \gamma m\,I\right)\vec{a}$$

where $I$ is the identity (unit) matrix. This form expresses the force as the acceleration multiplied by a 3x3 matrix - and it is this matrix that I meant as a possible measure of the object's inertia.
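
The resulting relation between force and acceleration can be verified numerically against the definition $\vec{F} = \frac{d\vec{p}}{dt}$. A minimal Python sketch of my own (not from the original text), assuming the decomposition $\vec{F} = \gamma^3 m\,\vec{a}_\parallel + \gamma m\,\vec{a}_\perp$ derived above, with $m = 1$ and $c = 1$; the sample motion is arbitrary:

```python
import math

def momentum(v):
    """Relativistic momentum p = gamma * m * v in 2D, with m = 1, c = 1."""
    g = 1.0 / math.sqrt(1.0 - (v[0]**2 + v[1]**2))
    return (g * v[0], g * v[1])

# An arbitrary sample motion with changing speed and direction:
def velocity(t):
    return (0.3 + 0.1 * t, 0.2 * t)

t, h = 1.0, 1e-6

# Force from the definition: numerical derivative of the momentum.
pm, pp = momentum(velocity(t - h)), momentum(velocity(t + h))
F = ((pp[0] - pm[0]) / (2 * h), (pp[1] - pm[1]) / (2 * h))

# Force from the decomposition: gamma^3 m a_par + gamma m a_perp.
v = velocity(t)
a = (0.1, 0.2)  # dv/dt of the motion above
v2 = v[0]**2 + v[1]**2
proj = (v[0] * a[0] + v[1] * a[1]) / v2
a_par = (proj * v[0], proj * v[1])            # parallel component
a_perp = (a[0] - a_par[0], a[1] - a_par[1])   # perpendicular component
g = 1.0 / math.sqrt(1.0 - v2)
F2 = (g**3 * a_par[0] + g * a_perp[0],
      g**3 * a_par[1] + g * a_perp[1])

assert all(math.isclose(x, y, rel_tol=1e-4) for x, y in zip(F, F2))
```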

Let us consider a collision of two objects with rest masses equal to $m$, of which one is moving with velocity $\vec{v}$, and the other one is at rest. The object at rest has energy and momentum given by:

$$E_1 = mc^2, \qquad \vec{p}_1 = 0$$

and the moving object:

$$E_2 = \gamma m c^2, \qquad \vec{p}_2 = \gamma m \vec{v}$$

The total energy and momentum of the system are then:

$$E = (1 + \gamma)\,m c^2, \qquad \vec{p} = \gamma m \vec{v}$$

Let's assume that the objects will be moving with velocities $\vec{u}_1$ and $\vec{u}_2$ after the collision. The total energy and momentum will then be (where $\gamma(u)$ denotes the Lorentz factor for speed $u$):

$$E' = \gamma(u_1)\,mc^2 + \gamma(u_2)\,mc^2, \qquad \vec{p}\,' = \gamma(u_1)\,m\vec{u}_1 + \gamma(u_2)\,m\vec{u}_2$$

Due to the conservation of energy and momentum, these have to be exactly equal to the values from before the collision. This gives us two equations for two unknowns: $\vec{u}_1$, $\vec{u}_2$.

There is one obvious solution, and that is $\vec{u}_1 = 0$, $\vec{u}_2 = \vec{v}$ - this is simply the situation from before the collision. But thanks to the symmetry of the problem, there is a different, equally obvious solution: $\vec{u}_1 = \vec{v}$, $\vec{u}_2 = 0$. This second solution then has to correspond to the situation after the collision (it can be proven that there are no other solutions) - so the objects will behave such that the moving one will stop, and the other one will start moving with the same speed the first one had before the collision.

Which means they will behave exactly like objects with equal masses.
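
The symmetry argument can also be checked directly against the conservation laws. A minimal Python sketch (unit rest mass, $c = 1$; the speed $v = 0.6$ is an arbitrary choice of mine):

```python
import math

c, m, v = 1.0, 1.0, 0.6

def gamma(u):
    return 1.0 / math.sqrt(1.0 - u**2 / c**2)

def energy(u):
    """Total energy of a single object moving with speed u."""
    return gamma(u) * m * c**2

def momentum(u):
    """Momentum of a single object moving with speed u (1D)."""
    return gamma(u) * m * u

# Before the collision: one object at rest, one moving with speed v.
E_before = energy(0.0) + energy(v)
p_before = momentum(0.0) + momentum(v)

# The "swapped" solution: the moving object stops,
# the other one moves off with the same speed v.
E_after = energy(v) + energy(0.0)
p_after = momentum(v) + momentum(0.0)

assert math.isclose(E_before, E_after)
assert math.isclose(p_before, p_after)
```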

Let's imagine, for example, that we are airline pilots and our task is to fly as quickly as possible from Warsaw, Poland to San Francisco. We take a world map and knowing from Euclidean geometry that a straight line is the shortest path between two points, we draw such a line from Warsaw to San Francisco. We're getting ready to depart and fly along the course we plotted... but fortunately, our navigator friend tells us that we fell into a trap.

The trap is that the surface of the Earth isn't flat! The map we used to plot our straight line course is just a projection of a surface that is close to spherical in reality. Because of that, the red line on the map below is not the shortest path - the purple line is:

It might be a bit clearer if we take a look at the Earth in a spherical form:

Let's note that the purple line (or the black one in the second picture) is still straight in some sense. If we get on a plane and start flying straight ahead, we'll be flying along this path. We won't have to make any turns.

Such lines - analogous to straight lines, but on curved surfaces - are called *geodesics* and we'll talk a bit about plotting (or maybe rather: calculating) them here.
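
As a side note, the length of that shortest path can be computed with the haversine formula. A minimal Python sketch (the city coordinates are approximate, and the Earth is treated as a perfect sphere of radius 6371 km):

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Length of the orthodrome (great-circle arc) between two points
    on a sphere, computed with the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Approximate coordinates of Warsaw and San Francisco:
distance = great_circle_km(52.23, 21.01, 37.77, -122.42)
print(f"{distance:.0f} km")  # on the order of 9,000-10,000 km
```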

But before we dive into the details, we have to expand our conceptual apparatus a bit, so that we have the right vocabulary to talk about such general notions/spaces.

Let's start by looking at generalizations of the Euclidean space called *manifolds*.

What is a manifold? It's just a set of points which locally resembles a Euclidean space. What does that mean? To put it in simple terms, if we choose a point on our manifold and look at its close neighborhood, it will look like a Euclidean space; that is, it will be a good approximation to talk about straight lines and other shapes known from Euclidean geometry in this neighborhood. So, basically, it's possible to use Euclidean geometry in small neighborhoods of points of a manifold.

(How small does the "small neighborhood" need to be? Mathematicians use the notion of a limit in such cases. Simply put, the smaller the neighborhood, the closer its geometry will be to Euclidean geometry. Strictly speaking, it might only become exactly Euclidean when the size of the neighborhood is zero - which might not seem particularly useful, because we're talking about a single point then - but it turns out that it's enough for many interesting purposes.)

What can you do with manifolds? Primarily, you can introduce coordinate systems on them, which are called *charts* in this context. A chart is a function mapping a subset of the manifold (or, sometimes, the whole manifold) to a subset of a Euclidean space, usually identified with $\mathbb{R}^n$. This means that we assign $n$ real numbers to every point of some part of our manifold - which is exactly the same as when coordinate systems are introduced on a plane, or in a Euclidean space. $n$ is the dimension of the manifold here - just like Euclidean spaces, manifolds can have arbitrary dimensions.

It might be that the shape of the manifold is so complex that it's impossible to define a chart covering all of it. It's not a problem as long as every part can be covered by *some* chart, and parts covered by different charts have some points in common. One can then describe different parts of the manifold in different charts, and translate the description from one chart to another in the common parts. A set of charts defined on a manifold is called an *atlas*.

The functions mapping one chart to another, on a part of the manifold where multiple charts are defined, are called *transition maps*.

Because a chart is *de facto* a coordinate system, I'll just write about coordinate systems and transformations between them in the later part of the article.

Note that transition maps (the transformations between coordinate systems) are functions from $\mathbb{R}^n$ to $\mathbb{R}^n$. Such functions are sometimes differentiable; exactly this case is the subject of interest of *differential geometry*. The manifold is then called a *differentiable manifold*.

Once we have a differentiable manifold and coordinates on it, we can also talk about vector, covector and tensor fields, and do various interesting things. But in order to talk about geodesics, we need another piece of the puzzle - *the metric*. Briefly speaking, a metric is something that defines distances between the points of our manifold, and in a sense also defines its shape this way. A 2-dimensional manifold without a metric can be anything. It's the metric that allows us to distinguish between a part of a plane and a part of a sphere, or a part of a hyperbolic paraboloid.

I wrote a bit more about the metric and other notions related to differential geometry in the series Mathematics of black holes. It's unfinished, but I still recommend reading it before proceeding to the next part of this article. If you are not familiar with differential geometry, it should clarify some notions and notation I will be using later in the article. (This article will probably become a part of the series at some point, but I might have to rethink it first.)

Before we get to geodesics, let's consider how to describe simple straight lines in Euclidean geometry, but using the notions introduced above.

So, let's assume that our manifold is a simple Euclidean plane. We have Cartesian coordinates on that plane, and the metric in these coordinates is just the identity matrix (see Part 3 - the metric). Say that we also have a curve given as a function $x^i(\lambda)$ (also called the parametric form). How can we tell whether this curve is actually a straight line?

Let's consider what the simplest way of describing a straight line in the parametric form is. We want to get different points on a straight line for different values of $\lambda$. We can achieve that by choosing a point on the line and translating it by some vector along the line, depending on the value of $\lambda$. It could look like this, for example:

$$\vec{x}(\lambda) = \vec{x}_0 + \vec{v}\,\lambda$$

(The similarity to the equation of straight, uniform motion is not a coincidence ;) )

Using a notation more similar to the one usually used in the context of differential geometry, we could also write it like this:

$$x^i(\lambda) = x^i_0 + v^i \lambda$$

This notation has a nice property - a complete lack of assumptions regarding the number of dimensions. This equation will look exactly the same on a plane as in a 16-dimensional space. It's only a question of what the range of the index $i$ is.

What can we see here? The general parametric equation of a straight line is just a linear function of the parameter $\lambda$ (or, more accurately: linear functions, one per coordinate). There is a simple equation whose solutions are all linear functions and only linear functions. This equation is:

$$\frac{d^2 x^i}{d\lambda^2} = 0$$

Or: the second derivatives of the coordinates with respect to the parameter vanish.

If we denote the derivative with respect to $\lambda$ with a dot above the variable, which is a commonly used convention, the equation will look as follows:

$$\ddot{x}^i = 0$$

It's easy to see that this is equivalent to the straight line equation above. If we take that equation and calculate the first derivative, the constant $x^i_0$ will vanish, and the term $v^i \lambda$ will get reduced to $v^i$ - a constant, which will vanish when we calculate the second derivative. This means that the straight line equation satisfies this differential equation.

And the other way round: if the second derivative of $x^i$ is zero, it means that the first derivative is a constant, and $x^i$ itself is a linear function of $\lambda$.

This is then what the general equation of a straight line looks like in Euclidean geometry. Short and to the point.
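
This equivalence is easy to sanity-check numerically: the second difference quotient of a linear function vanishes. A tiny Python sketch (the particular line is an arbitrary choice of mine):

```python
def x(lam):
    """A parametric straight line: x(lambda) = x0 + v * lambda."""
    return 2.0 + 3.0 * lam

# A second-order central difference approximates the second derivative:
h, lam = 1e-3, 0.7
second = (x(lam + h) - 2 * x(lam) + x(lam - h)) / h**2
assert abs(second) < 1e-6  # vanishes, up to floating-point noise
```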

(Small note: this is the equation of a straight line in the so-called *affine parameterization*, which means that $\lambda$ is proportional to the distance from the point at $\lambda = 0$. Other parameterizations are possible, in which the second derivatives of the coordinates don't necessarily vanish. However, they would complicate the reasoning, and every parameterization can be transformed into an affine one, so I'll focus on this case only.)

We will now see how to start from this equation, and eventually get the general equation for geodesics.

After the warm-up in the Euclidean space, it is time to look at arbitrary manifolds. Let's assume that we have some manifold, a coordinate system $x^i$ and a metric expressed in these coordinates, $g_{ij}$. We are given a curve $x^i(\lambda)$ and we have to tell whether it is a geodesic.

A question that immediately comes to mind is: but how is a geodesic actually defined?

We'll exploit the fact that every manifold locally resembles Euclidean space. And if it locally resembles Euclidean space, then we can introduce coordinates resembling Cartesian coordinates on a small subset of it. Therefore, we'll say this: **a curve is a geodesic (in an affine parameterization) if, for every point $P$ on this curve, when we introduce coordinates $y^a$ resembling Cartesian coordinates in the neighborhood of $P$, the curve satisfies the equation $\ddot{y}^a = 0$ in these coordinates.**

So, in simpler terms: our curve is a geodesic if, when we take a point on it and look at a small neighborhood of this point - small enough for it to resemble Euclidean space - then in this neighborhood our curve resembles a straight line.

As it turns out, this is enough to get the geodesic equation. We'll just need to specify some things in more detail, primarily: what does "coordinates resembling Cartesian coordinates" mean?

We'll define it using the metric. We'll say that the coordinates $y^a$ locally resemble Cartesian coordinates if:

- the metric expressed in these coordinates (let's denote it by $h_{ab}$) is equal to the identity matrix at the point that is of interest (let's denote it by $P$),
- the derivatives of the metric at the point $P$ are 0.

Or, in equations:

$$h_{ab}(P) = \delta_{ab}, \qquad \frac{\partial h_{ab}}{\partial y^c}(P) = 0$$

As it turns out, it is always possible to choose coordinates such that these conditions are satisfied at a single point. If it's possible to choose ones such that the conditions are satisfied everywhere, then our manifold is a Euclidean space and our coordinates are Cartesian coordinates.

Since we will be operating on two coordinate systems in the later part of the article, it is worth reminding ourselves what transformations between coordinate systems look like, especially for points, vectors and the metric.

A coordinate transformation is given as a set of functions: every coordinate of one of the systems is expressed as a function of the coordinates of the other system:

$$x'^j = x'^j(x^1, x^2, \ldots, x^n)$$

For example, we can have two coordinate systems on a plane: Cartesian $(x, y)$ and polar $(r, \varphi)$. The transformations look like this, then:

$$x = r\cos\varphi, \quad y = r\sin\varphi \qquad \text{and} \qquad r = \sqrt{x^2 + y^2}, \quad \varphi = \operatorname{atan2}(y, x)$$

Thus, if we know the coordinates of a point in one system, we just use these functions to transform them to the other coordinate system.
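
In code, these transformations are simply a pair of mutually inverse functions. A minimal Python sketch (`atan2` takes care of the quadrant of the angle):

```python
import math

def polar_to_cartesian(r, phi):
    # x = r cos(phi), y = r sin(phi)
    return r * math.cos(phi), r * math.sin(phi)

def cartesian_to_polar(x, y):
    # r = sqrt(x^2 + y^2), phi = atan2(y, x)
    return math.hypot(x, y), math.atan2(y, x)

# Round trip: the two transformations are inverses of each other.
x, y = polar_to_cartesian(2.0, math.pi / 3)
r, phi = cartesian_to_polar(x, y)
assert math.isclose(r, 2.0) and math.isclose(phi, math.pi / 3)
```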

How about vectors? Imagine that we have a vector expressed as $V^i$ in coordinates $x^i$ and we want to express it as $V'^j$ in coordinates $x'^j$. Let's remember that a vector as a full mathematical object is actually a differential operator, in this case: $V^i \frac{\partial}{\partial x^i}$ (see Part 2 - coordinates, vectors and the summation convention). When we express it in coordinates $x'^j$ as $V'^j \frac{\partial}{\partial x'^j}$, it is still the same vector. So:

$$V^i \frac{\partial}{\partial x^i} = V'^j \frac{\partial}{\partial x'^j}$$

As the next step, we "move" the $\partial x'^j$ to the left side of the equation. Such an operation doesn't actually exist as a correct mathematical operation, but it's a useful mnemonic for remembering how to get a correct result - because the result below turns out to be correct:

$$V'^j = \frac{\partial x'^j}{\partial x^i}\,V^i$$

Back to the example with a plane and polar and Cartesian coordinates: if $V^x$, $V^y$ are the components of a vector in Cartesian coordinates, and we want to calculate the polar components ($V^r$, $V^\varphi$), we can do it like this:

$$V^r = \frac{\partial r}{\partial x}\,V^x + \frac{\partial r}{\partial y}\,V^y, \qquad V^\varphi = \frac{\partial \varphi}{\partial x}\,V^x + \frac{\partial \varphi}{\partial y}\,V^y$$

Let's note that the result will depend on the point at which we are performing the transformation. The same vector in Cartesian coordinates will have different polar components depending on which point it is bound to! Because of that, every transformation should be understood either as performed at a specific point, or as a function of the coordinates (in one system or the other).
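
This point-dependence is easy to see in code. A minimal Python sketch for the polar example, using the jacobian components $\frac{\partial r}{\partial x} = \frac{x}{r}$, $\frac{\partial r}{\partial y} = \frac{y}{r}$, $\frac{\partial \varphi}{\partial x} = -\frac{y}{r^2}$, $\frac{\partial \varphi}{\partial y} = \frac{x}{r^2}$:

```python
import math

def to_polar_components(vx, vy, x, y):
    """Transform the vector components (vx, vy), bound to the point
    (x, y), from Cartesian to polar coordinates:
    V^r = (x/r) vx + (y/r) vy,  V^phi = (-y/r^2) vx + (x/r^2) vy."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return (x / r) * vx + (y / r) * vy, (-y / r2) * vx + (x / r2) * vy

# The same Cartesian vector (1, 0), bound to two different points:
v1 = to_polar_components(1.0, 0.0, 1.0, 0.0)  # at (1, 0): purely radial
v2 = to_polar_components(1.0, 0.0, 0.0, 1.0)  # at (0, 1): purely angular
assert math.isclose(v1[0], 1.0) and abs(v1[1]) < 1e-12
assert abs(v2[0]) < 1e-12 and math.isclose(v2[1], -1.0)
```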

Finally, the metric. We'll use a similar trick here as with the vectors; that is, we'll notice that the full geometrical object is actually $g_{ij}\,dx^i\,dx^j$, which has to be equal to $g'_{kl}\,dx'^k\,dx'^l$. Hence:

$$g'_{kl} = \frac{\partial x^i}{\partial x'^k}\,\frac{\partial x^j}{\partial x'^l}\,g_{ij}$$

So, for example, if we have the metric in Cartesian coordinates and we want to transform it into polar coordinates, it will look like this:

$$g'_{rr} = \frac{\partial x}{\partial r}\frac{\partial x}{\partial r}\,g_{xx} + \frac{\partial x}{\partial r}\frac{\partial y}{\partial r}\,g_{xy} + \frac{\partial y}{\partial r}\frac{\partial x}{\partial r}\,g_{yx} + \frac{\partial y}{\partial r}\frac{\partial y}{\partial r}\,g_{yy}$$

...etc.

I recommend completing this calculation, knowing that the metric in Cartesian coordinates is the identity matrix (that is, $g_{xx} = g_{yy} = 1$, $g_{xy} = g_{yx} = 0$), as an exercise for the Reader.
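
Here is a numerical version of that exercise (a Python sketch of mine; the nested sum implements the transformation rule with the jacobian $\partial(x, y)/\partial(r, \varphi)$):

```python
import math

def polar_metric(r, phi):
    """Transform the Euclidean metric (the identity in Cartesian
    coordinates) to polar coordinates:
    g'_ab = (dx^i/dx'^a) (dx^j/dx'^b) g_ij."""
    # jacobian d(x, y)/d(r, phi)
    J = [[math.cos(phi), -r * math.sin(phi)],
         [math.sin(phi),  r * math.cos(phi)]]
    g = [[1.0, 0.0], [0.0, 1.0]]  # metric in Cartesian coordinates
    return [[sum(J[i][a] * J[j][b] * g[i][j]
                 for i in range(2) for j in range(2))
             for b in range(2)] for a in range(2)]

g = polar_metric(2.0, 0.7)
# Expected result: g'_rr = 1, g'_rphi = 0, g'_phiphi = r^2
assert math.isclose(g[0][0], 1.0)
assert abs(g[0][1]) < 1e-12 and abs(g[1][0]) < 1e-12
assert math.isclose(g[1][1], 4.0)
```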

As a final note: $\frac{\partial x'^j}{\partial x^i}$ is a matrix (with values depending on the coordinates), called the jacobian of the transformation. The derivatives of the transformation in the opposite direction - $\frac{\partial x^i}{\partial x'^j}$ - constitute the inverse matrix. This means that:

$$\frac{\partial x'^j}{\partial x^i}\,\frac{\partial x^i}{\partial x'^k} = \delta^j_k$$

where $\delta^j_k$ is the so-called Kronecker delta - an identity matrix. (It is different from the metric in Cartesian coordinates in that it has one upper and one lower index, while the metric has two lower indices. This makes it an identity matrix in *every* coordinate system. The reason why that is so is beyond the scope of this article - what's important for us is that it is true.)

The Kronecker delta multiplied by some other value only "changes the index"; that is, for example:

$$\delta^i_j\,V^j = V^i$$

Let's go back to our curve on the manifold. We have the coordinates $x^i$ and the metric expressed in these coordinates, $g_{ij}$, and the local coordinates $y^a$ and the same metric expressed in these coordinates, $h_{ab}$.

According to our definition, the geodesic equation in the coordinates $y^a$ is:

$$\ddot{y}^a = 0$$

We shall now try to express it in the coordinates $x^i$, which are our initial coordinates (reminder: $y^a$ are just local coordinates, the ones resembling Cartesian ones).

The first derivatives of the coordinates of points on the curve, $\dot{x}^i$, constitute the vector tangent to the curve. Since it's a vector, we know how to express it in the coordinates $y^a$:

$$\dot{y}^a = \frac{\partial y^a}{\partial x^i}\,\dot{x}^i$$

Let's calculate the derivative of that with respect to $\lambda$:

$$\ddot{y}^a = \frac{d}{d\lambda}\left(\frac{\partial y^a}{\partial x^i}\right)\dot{x}^i + \frac{\partial y^a}{\partial x^i}\,\ddot{x}^i$$

How do we calculate the derivative of the term $\frac{\partial y^a}{\partial x^i}$? Remember that this expression depends on the coordinates (be it $x^i$ or $y^a$). The coordinates are, in turn, some functions of $\lambda$ along our curve. This means that we can use the chain rule: calculate the derivatives with respect to the coordinates, and multiply by the derivatives of the coordinates with respect to $\lambda$:

$$\frac{d}{d\lambda}\left(\frac{\partial y^a}{\partial x^i}\right) = \frac{\partial^2 y^a}{\partial x^j \partial x^i}\,\dot{x}^j$$

Hence:

$$\ddot{y}^a = \frac{\partial^2 y^a}{\partial x^i \partial x^j}\,\dot{x}^i \dot{x}^j + \frac{\partial y^a}{\partial x^i}\,\ddot{x}^i$$

The second derivatives of $y^a$ are 0 along a geodesic, so we get (switching the notation again to the "dotted" one):

$$\frac{\partial^2 y^a}{\partial x^i \partial x^j}\,\dot{x}^i \dot{x}^j + \frac{\partial y^a}{\partial x^i}\,\ddot{x}^i = 0$$

Let's also multiply both sides by $\frac{\partial x^k}{\partial y^a}$:

$$\frac{\partial x^k}{\partial y^a}\,\frac{\partial^2 y^a}{\partial x^i \partial x^j}\,\dot{x}^i \dot{x}^j + \frac{\partial x^k}{\partial y^a}\,\frac{\partial y^a}{\partial x^i}\,\ddot{x}^i = 0$$

Now, remember that $\frac{\partial x^k}{\partial y^a}\frac{\partial y^a}{\partial x^i} = \delta^k_i$ and that $\delta^k_i\,\ddot{x}^i = \ddot{x}^k$, which gives us:

$$\ddot{x}^k + \frac{\partial x^k}{\partial y^a}\,\frac{\partial^2 y^a}{\partial x^i \partial x^j}\,\dot{x}^i \dot{x}^j = 0$$

Okay. We got some equation for the coordinates $x^i$, but there are still derivatives of the transformations between $x^i$ and $y^a$ and vice versa in there. Can we get rid of $y^a$ completely? It turns out that we can. We need to use the metric for that.

Let's remember that we still have the metric expressed in the coordinates $x^i$ as $g_{ij}$ and in the coordinates $y^a$ as $h_{ab}$. Because it's the same metric, just in different coordinates, we can write:

$$g_{ij} = \frac{\partial y^a}{\partial x^i}\,\frac{\partial y^b}{\partial x^j}\,h_{ab}$$

Let's calculate the derivatives of the metric with respect to the coordinates $x^k$. We get:

$$\frac{\partial g_{ij}}{\partial x^k} = \frac{\partial y^c}{\partial x^k}\,\frac{\partial h_{ab}}{\partial y^c}\,\frac{\partial y^a}{\partial x^i}\,\frac{\partial y^b}{\partial x^j} + h_{ab}\,\frac{\partial^2 y^a}{\partial x^k \partial x^i}\,\frac{\partial y^b}{\partial x^j} + h_{ab}\,\frac{\partial y^a}{\partial x^i}\,\frac{\partial^2 y^b}{\partial x^k \partial x^j}$$

But wait - we assumed that the derivatives of the metric $h_{ab}$ are 0! (Well, we assumed that about the derivatives with respect to $y^c$, but if all the derivatives are 0 in one coordinate system, they are also 0 in all coordinate systems - I recommend checking that as an exercise.) This means that the whole first term vanishes, and since $h_{ab} = \delta_{ab}$ at the point $P$, we get:

$$\frac{\partial g_{ij}}{\partial x^k} = \delta_{ab}\,\frac{\partial^2 y^a}{\partial x^k \partial x^i}\,\frac{\partial y^b}{\partial x^j} + \delta_{ab}\,\frac{\partial y^a}{\partial x^i}\,\frac{\partial^2 y^b}{\partial x^k \partial x^j}$$

(We will now denote $\frac{\partial g_{ij}}{\partial x^k}$ as $g_{ij,k}$ for the sake of brevity.)

Let's express $\frac{\partial x^k}{\partial y^a}$ using the metric now. We will show that:

$$\frac{\partial x^k}{\partial y^a} = g^{kl}\,\delta_{ab}\,\frac{\partial y^b}{\partial x^l}$$

After a substitution of $g^{kl} = \delta^{cd}\,\frac{\partial x^k}{\partial y^c}\,\frac{\partial x^l}{\partial y^d}$ (the inverse metric transforms with the inverse jacobians, and in the coordinates $y^a$ it is the identity at the point $P$), the right-hand side becomes:

$$g^{kl}\,\delta_{ab}\,\frac{\partial y^b}{\partial x^l} = \delta^{cd}\,\frac{\partial x^k}{\partial y^c}\,\frac{\partial x^l}{\partial y^d}\,\delta_{ab}\,\frac{\partial y^b}{\partial x^l}$$

Let's remember again that opposite jacobians give the Kronecker delta when multiplied:

$$\frac{\partial x^l}{\partial y^d}\,\frac{\partial y^b}{\partial x^l} = \delta^b_d$$

Hence:

$$\delta^{cd}\,\frac{\partial x^k}{\partial y^c}\,\delta_{ab}\,\delta^b_d = \frac{\partial x^k}{\partial y^a}$$

which confirms the claim.

Let's define $\Gamma_{lij} = \delta_{ab}\,\frac{\partial y^a}{\partial x^l}\,\frac{\partial^2 y^b}{\partial x^i \partial x^j}$ and $\Gamma^k_{ij} = g^{kl}\,\Gamma_{lij}$. Then:

$$\frac{\partial x^k}{\partial y^a}\,\frac{\partial^2 y^a}{\partial x^i \partial x^j} = g^{kl}\,\delta_{ab}\,\frac{\partial y^b}{\partial x^l}\,\frac{\partial^2 y^a}{\partial x^i \partial x^j} = \Gamma^k_{ij}$$

And the geodesic equation takes the form:

$$\ddot{x}^k + \Gamma^k_{ij}\,\dot{x}^i \dot{x}^j = 0$$

Let us also note that $\Gamma_{lij} = \Gamma_{lji}$; that is, it is symmetric in the last two indices. That's because in the expression for this value, these indices only appear in the second derivatives of $y^b$, and derivatives are symmetric with respect to the order of differentiation, that is:

$$\frac{\partial^2 y^b}{\partial x^i \partial x^j} = \frac{\partial^2 y^b}{\partial x^j \partial x^i}$$

We are now just a single step from the finish line. The only thing that's left is to write an expression for $\Gamma_{lij}$ that doesn't contain the $y^a$ coordinates. We will get that thanks to the following equation (it is just the expression for the derivative of the metric derived above, with the two terms recognized as $\Gamma$'s):

$$g_{li,j} = \Gamma_{lji} + \Gamma_{ijl}$$

And with shuffled indices:

$$g_{lj,i} = \Gamma_{lij} + \Gamma_{jil}, \qquad g_{ij,l} = \Gamma_{ilj} + \Gamma_{jli}$$

Adding the first two equations and subtracting the third one, and using the symmetry in the last two indices, we get:

$$g_{li,j} + g_{lj,i} - g_{ij,l} = 2\,\Gamma_{lij}$$

Hence:

$$\Gamma^k_{ij} = \frac{1}{2}\,g^{kl}\left(g_{li,j} + g_{lj,i} - g_{ij,l}\right)$$

$g^{kl}$ is called the inverse metric. It is the matrix that is inverse to the metric, that is, one such that $g^{kl}\,g_{lm} = \delta^k_m$.

As an aside, the $\Gamma^k_{ij}$ are called *Christoffel symbols* - they are a certain measure of how curvilinear the coordinate system is, and by calculating their derivatives one can get the Riemann curvature tensor, which quantifies the curvature of the manifold. However, these are topics far beyond the scope of this article.
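
This recipe can be automated: given the metric as a function of the coordinates, the Christoffel symbols follow from the formula above, with the derivatives taken numerically. A Python sketch of mine for 2-dimensional manifolds, checked against the standard values for a unit sphere (metric $\mathrm{diag}(1, \cos^2\phi)$ in latitude/longitude coordinates):

```python
import math

def christoffel(metric, x, h=1e-5):
    """Gamma^a_{bc} = 1/2 g^{ad} (d_b g_{dc} + d_c g_{db} - d_d g_{bc}),
    with the derivatives taken numerically. `metric` maps a coordinate
    tuple to a 2x2 matrix; this sketch handles 2-dimensional manifolds."""
    n = 2
    def dg(d):  # derivative of the metric along coordinate d
        xp, xm = list(x), list(x)
        xp[d] += h
        xm[d] -= h
        gp, gm = metric(xp), metric(xm)
        return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                for i in range(n)]
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[ g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det,  g[0][0] / det]]
    partial = [dg(d) for d in range(n)]  # partial[d][i][j] = d_d g_ij
    return [[[0.5 * sum(ginv[a][d] * (partial[b][d][c] + partial[c][d][b]
                                      - partial[d][b][c])
                        for d in range(n))
              for c in range(n)] for b in range(n)] for a in range(n)]

# Unit sphere in (phi, lambda) coordinates: g = diag(1, cos^2 phi)
sphere = lambda x: [[1.0, 0.0], [0.0, math.cos(x[0]) ** 2]]
G = christoffel(sphere, (0.5, 1.0))
# Known values: Gamma^phi_{ll} = sin(phi) cos(phi), Gamma^l_{phi l} = -tan(phi)
assert math.isclose(G[0][1][1], math.sin(0.5) * math.cos(0.5), rel_tol=1e-4)
assert math.isclose(G[1][0][1], -math.tan(0.5), rel_tol=1e-4)
```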

Thus, based on the intuition that a geodesic should be something analogous to a straight line on an arbitrary surface, we eventually got the equation:

$$\ddot{x}^k + \Gamma^k_{ij}\,\dot{x}^i \dot{x}^j = 0$$

where:

$$\Gamma^k_{ij} = \frac{1}{2}\,g^{kl}\left(g_{li,j} + g_{lj,i} - g_{ij,l}\right)$$

There are no $y^a$ coordinates in these equations. This means that we can now find geodesics knowing only the shape of the surface, given as a metric in an arbitrary coordinate system.

As a final example, I will show how to use these equations to find the equations for geodesics on a sphere (called orthodromes). We will use the geographic coordinates $(\phi, \lambda)$ ($\phi$ - latitude, from $-90°$ to $90°$, $\lambda$ - longitude, from $-180°$ to $180°$), in which the metric of a sphere looks like the following:

$$g = \begin{pmatrix} R^2 & 0 \\ 0 & R^2 \cos^2\phi \end{pmatrix}$$

$R$ is the radius of the sphere here.

The inverse metric is:

$$g^{-1} = \begin{pmatrix} \frac{1}{R^2} & 0 \\ 0 & \frac{1}{R^2 \cos^2\phi} \end{pmatrix}$$

The derivatives of the metric: only $g_{\lambda\lambda}$ isn't constant, and it depends only on $\phi$, so the only nonzero derivative among the 8 possible ones is:

$$g_{\lambda\lambda,\phi} = -2R^2 \cos\phi \sin\phi$$

The only nonzero Christoffel symbols with the lowered index are those where $\lambda$ occurs twice, and $\phi$ once:

$$\Gamma_{\phi\lambda\lambda} = -\frac{1}{2}\,g_{\lambda\lambda,\phi} = R^2 \cos\phi \sin\phi, \qquad \Gamma_{\lambda\phi\lambda} = \Gamma_{\lambda\lambda\phi} = \frac{1}{2}\,g_{\lambda\lambda,\phi} = -R^2 \cos\phi \sin\phi$$

What's left is to raise the first index:

$$\Gamma^\phi_{\lambda\lambda} = g^{\phi\phi}\,\Gamma_{\phi\lambda\lambda} = \cos\phi \sin\phi, \qquad \Gamma^\lambda_{\phi\lambda} = \Gamma^\lambda_{\lambda\phi} = g^{\lambda\lambda}\,\Gamma_{\lambda\phi\lambda} = -\tan\phi$$

We can now write the geodesic equations:

$$\ddot{\phi} + \cos\phi \sin\phi\,\dot{\lambda}^2 = 0, \qquad \ddot{\lambda} - 2\tan\phi\,\dot{\phi}\,\dot{\lambda} = 0$$

Or, written differently:

$$\ddot{\phi} = -\cos\phi \sin\phi\,\dot{\lambda}^2, \qquad \ddot{\lambda} = 2\tan\phi\,\dot{\phi}\,\dot{\lambda}$$

Using these equations, it's possible to plot an orthodrome, knowing the initial position and heading.
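
As a sanity check, we can integrate these equations numerically and verify that the resulting path really is a great circle - that is, that it stays in a fixed plane passing through the center of the sphere. A Python sketch (unit radius; the initial conditions are an arbitrary choice of mine):

```python
import math

# Geodesic equations on a sphere in geographic coordinates
# (phi = latitude, lam = longitude), affine parameter t:
#   phi'' = -sin(phi) cos(phi) lam'^2
#   lam'' =  2 tan(phi) phi' lam'
def step(state, h):
    """One 4th-order Runge-Kutta step for (phi, lam, dphi, dlam)."""
    def deriv(s):
        phi, lam, dphi, dlam = s
        return (dphi, dlam,
                -math.sin(phi) * math.cos(phi) * dlam**2,
                2.0 * math.tan(phi) * dphi * dlam)
    k1 = deriv(state)
    k2 = deriv([s + h / 2 * k for s, k in zip(state, k1)])
    k3 = deriv([s + h / 2 * k for s, k in zip(state, k2)])
    k4 = deriv([s + h * k for s, k in zip(state, k3)])
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def to_xyz(phi, lam):
    """Point on the unit sphere in 3D Cartesian coordinates."""
    return (math.cos(phi) * math.cos(lam),
            math.cos(phi) * math.sin(lam),
            math.sin(phi))

# Start on the equator, heading "north-east". A great circle stays in
# the plane spanned by the initial position and velocity, so its normal
# vector (their cross product) should stay perpendicular to the path.
state = [0.0, 0.0, 0.5, 0.5]
x0 = to_xyz(0.0, 0.0)
v0 = (0.0, 0.5, 0.5)  # 3D velocity of the starting conditions above
normal = (x0[1] * v0[2] - x0[2] * v0[1],
          x0[2] * v0[0] - x0[0] * v0[2],
          x0[0] * v0[1] - x0[1] * v0[0])

for _ in range(2000):
    state = step(state, 0.001)
    x = to_xyz(state[0], state[1])
    assert abs(sum(n * c for n, c in zip(normal, x))) < 1e-6
```

The same integrator, started from an actual position and heading, is what "plotting an orthodrome" boils down to in practice.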

This article was intended to show that it's possible to derive the general geodesic equation using only the basics of differential geometry and the intuition about geodesics - that is, that they are lines that are "as straight as possible", which I tried to formalize a bit as satisfying the straight line equation in coordinates "locally resembling Cartesian coordinates". Whether I succeeded - it's not for me to judge, but I'll gladly learn from the comments :) I also encourage asking questions if something is unclear, and of course pointing out mistakes, which I probably didn't manage to avoid completely :)

We - people - are surrounded by a certain reality. Since times immemorial some people have been noticing that there are some patterns to this reality, that it seems to follow some rules. Some of them got interested enough by this that they wanted to know more. They wanted to understand what the surrounding world actually is and how it works.

And here is where the problems start. The only tool the ancient people had that could help them tackle the problem of uncovering the rules of the world, was their intuition. And that intuition, which is a pretty good evolutionary adaptation to the environment our ancestors lived in, tends to fail spectacularly when applied to problems that weren't a part of that environment. The riddle you can try to solve here serves as a good illustration - I recommend having a go before reading the rest of the article, because it will become much less interesting afterwards.

Noticing that intuition can be deceiving and finding an effective way of counteracting this deception took humanity a very long time. The result is the scientific method.

The formal definition can be found on Wikipedia, but to put it shortly - it is a set of methods for studying the world that aims to obtain results that are as objective as possible. There are two main approaches to studying the world this way - experiments and theories - and we will start by describing them.

Let us start with the experiments, as they are the more intuitive of the two.

Simply put, the experimental study of reality consists of gathering data by checking the reality's "answers" to some conditions "posed" to it. The conditions might be "posed" by an observer (eg. in a laboratory), or they can exist naturally (like in astronomical observations, where we don't control the celestial objects at all). We study what happens under what conditions and document it thoroughly, and we get to know some part of the reality this way.

A simple example: our study of reality could consist of checking what happens when we throw some objects in the air. We threw a ball - it fell. We threw a rock - it fell. We threw a fork - it fell. Now we have some data about the reality.

But experiments by themselves are not everything. Studying reality only by experimenting resembles learning a subject at school by memorizing the textbook - we can answer some questions afterwards, but can we really say that we understand the subject? In order to bring our understanding of reality to the next level, we construct theories.

Theories are systems that enable predicting reality's behavior under given circumstances. They are tools that answer questions like "if the conditions are so and so, what will happen?". Theories give structure to experimental data and enable drawing conclusions beyond what has been directly studied experimentally.

In our example, a simple theory could be "objects thrown in the air fall back down". The experimental data has shown that when we threw a ball, a rock or a fork, they fell, which made us attempt to formulate such a theory. If somebody asks us afterwards "and what happens if I throw a pen?", we can answer: "according to our theory, it will fall". We haven't studied this case experimentally yet, but we can reason about it based on our theory.

This raises the question: how do we know that our conclusion from the theory will be correct? That if we perform an experiment we haven't tried before, the result will be what the theory predicts? The answer is: we can't possibly know that! And this is the crux of many misunderstandings.

Can we test our theory in any way? Can we make sure that objects thrown in the air, in fact, always fall? Can we ever be sure that this is how reality works?

Unfortunately, we cannot. No matter how many different objects we throw and observe falling, another one can always do something else. Of course, the more objects we try to throw, the more certain we will get that the theory is correct, as long as the results agree with it. However, our certainty can never reach 100%.

No scientific theory can be completely *verified*. Every scientific theory can be *falsified*, though. What is more - falsifiability is often considered a criterion a theory must satisfy to even be considered scientific.

In our example it is enough that a single object thrown in the air doesn't fall, and we will know that our theory can't be a correct description of reality. It is falsifiable, then. Can we falsify it in practice?

Well, it would be enough that someone would hand us a balloon filled with helium to throw, or that we would try to perform our experiments with throwing objects in a falling elevator, for example. In both cases the objects won't fall, falsifying our simple theory.

Does it mean, then, that our theory is useless and should be scrapped? No! It is still correct *in some domain*. We just need to specify the conditions for its applicability, eg. "objects heavier than air thrown by an observer standing on the ground fall" (of course, in order to come up with a reasonable set of conditions, not having any prior knowledge, we would have to perform tens, hundreds, thousands of experiments first). This way we can obtain a theory fitting all experimental results known to us.

Is it already the final theory? And what have I said earlier about complete verification? Someone could still hand us a balloon filled with helium in a vacuum chamber - and suddenly it will turn out that this theory has holes, too. We could again identify the conditions that made it incorrect and amend it further, though.

In our example, we formulated our theory after performing a few experiments. We gathered some data about reality, noticed a pattern, proposed a general rule. Are theories always created this way?

Not necessarily. If someone, for example, started with an idea that material objects tend to find themselves on the ground, they could create the same theory. Someone else could simply dream the idea for this theory up. In both cases the final result is the same sentence we phrased after our initial observations: "objects thrown in the air fall". Are these theories somehow worse because of where they came from? No! The only thing that decides the value of a theory is how well it predicts experimental results, not how it came to be. If we have a system that can predict the numbers in a lottery, we will be just as rich whether we derived it from the Bible or from detailed observations of previous drawings.

Building theories on ideas or thought experiments instead of on experimental results is actually a common approach in physics. Of course, such theories have to be confronted with experimental results all the same - as I mentioned before, it's how well the predictions match the experiments that decides the value of a theory, and this can only be checked by actually performing experiments. Nevertheless, theories that are based on a single specific idea, or a few of them, and that predict results well at the same time, are considered particularly elegant (Einstein's theory of relativity is a great example here). Most theoretical physicists dream of finding a single idea that would allow them to derive a theory correctly predicting every possible experimental result.

It is worth making a note at this point about how far we can go in drawing conclusions about reality from a theory.

Imagine that someone postulated an idea that is saying something about reality itself, eg. "the Universe started with the Big Bang". Having such an idea, we can try figuring out various consequences it would have for the actual shape of reality, eg. that space should be constantly expanding, or that there should exist a microwave background radiation, etc. In actual science, both ideas and their consequences are usually expressed in mathematical language, for the sake of maximal precision.

Some of the consequences derived this way will be propositions that can be directly, experimentally tested. It will be possible to check if reality really looks the way it should if the given idea was correct. Let us assume, then, that the tests were performed and all of them showed that reality looks exactly the way it should if the idea was correct. In our example: we can detect that other galaxies are redshifted, and that there exists a cosmic microwave background - and both effects were actually detected. This means that we are observing a reality that we would expect if there actually was a Big Bang in the beginning.

Does this mean that reality *actually is* the way the idea tells us? If all experimental results match the idea of an initial Big Bang, then there *really was* a Big Bang?

Strictly speaking - no. Drawing conclusions this way is a logical fallacy called affirming the consequent. The correctness of the idea implies the reality we are observing - but such observations do not, by themselves, imply the correctness of the idea. There is a reason I wrote "strictly speaking", though. In practice, so many experiments are being (and have been) performed that if a theory derived from an idea matches all of them, it is really hard to imagine another theory, matching the experiments equally well, that would contain the negation of the idea. In our Big Bang example, almost all - if not all - of the effects implied by the Big Bang have been detected, so it is really hard to imagine an equally good theory that would not assume the Big Bang, or would even assume the lack of one. *A priori*, it is possible on a purely logical level - at least until we have an actual proof that the negation of the idea in question (the lack of a Big Bang) unequivocally contradicts reality. Nevertheless, even lacking such a proof, an idea can often be considered well-founded - with the caveat that we will stop considering it so if a counterexample is found. So, for the moment, we deem the Big Bang as having actually happened, but we remain open to the possibility that another explanation might arise that would not require it.

Neither experiments nor theories give us any direct information about objective reality. Obtaining such information is probably impossible anyway - since every observer has no choice but to study it through their own subjective perception. For this reason, the scientific method focuses on *intersubjective verifiability* instead - that is, such a presentation of theories and experimental results which allows independent observers to check them and come to the same conclusions, each in their own subjective perspective.

The issue of intersubjective verifiability is simpler for theories. What is needed is an expression of the theory that allows others to independently derive the same predictions from it. This is usually achieved by using rigorous, precise language in formulating theories - often mathematics. If people other than the author can understand which reasonings are correct within the theory, and which are not, the goal has been achieved.

There is a slightly bigger problem with this in the case of experiments. Precise language is also required - it is necessary for other people to be able to recreate both our experimental setup, and the conditions in which we were performing the experiment. But there is still one element missing, and it is specifying what results can be considered the same, and what results can't.

Imagine, for example, that we are trying to measure the Earth's gravitational acceleration with a pendulum - it's a pretty simple experiment: it boils down to measuring the length of the pendulum and its period of oscillation, and plugging them into a simple formula. Imagine that we performed the experiment and got a result of 9.8 m/s², and our friend performed it too, but got 9.83 m/s². So what now? Does it mean that only one of us got the correct result? Or maybe neither of us? Have we forgotten to take some factor into account...?

The answer is: it depends. It depends on how accurately we measured the length of the pendulum and its period of oscillations. No measurement is perfectly accurate - the instruments have their limitations, and every measurement is distorted by random factors that are impossible to take into account. All of this means that every experimental result has a corresponding *uncertainty*. The analysis of uncertainties is an important part of the job of every experimental scientist.

When we account for the accuracy of the instruments and other factors, it might turn out that our result with its uncertainty is 9.8 ± 0.1 m/s², and our friend's - 9.83 ± 0.15 m/s². This would mean that not only are our results not contradictory, they are even very much in agreement!

The uncertainties also play a very important role in comparing the experimental results with theoretical predictions. If a theory predicts an acceleration of 9.81 m/s², and we got 9.8 m/s², that doesn't mean that the theory has been falsified yet! If that was 9.800 ± 0.001 m/s², and the theory predicts 9.8132 ± 0.0003 m/s² (yes, the predictions can have uncertainties, too - they are often based on experimental results that have uncertainties themselves), then the theory would be in trouble. But if it is 9.8 ± 0.1 m/s², as in our example, then it is a result matching the theory.
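These comparisons can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions - the standard simple-pendulum formula g = 4π²L/T², first-order uncertainty propagation, and overlapping uncertainty intervals as the agreement criterion; the function names are made up for this sketch:

```python
import math

def pendulum_g(length_m, dlength, period_s, dperiod):
    """Estimate g from a simple pendulum: g = 4*pi^2 * L / T^2.

    Uncertainty via the standard first-order propagation formula:
    (dg/g)^2 = (dL/L)^2 + (2*dT/T)^2.
    """
    g = 4 * math.pi**2 * length_m / period_s**2
    dg = g * math.sqrt((dlength / length_m)**2 + (2 * dperiod / period_s)**2)
    return g, dg

def results_agree(a, da, b, db):
    """Two results agree if their uncertainty intervals overlap."""
    return abs(a - b) <= da + db

g1, dg1 = pendulum_g(1.000, 0.005, 2.007, 0.01)
print(f"g = {g1:.2f} ± {dg1:.2f} m/s^2")
print(results_agree(9.8, 0.1, 9.83, 0.15))  # → True
```

The overlap criterion here is a deliberate simplification - in practice one would compare the difference to the combined uncertainty more carefully (e.g. in units of standard deviations).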

An important part of the scientific method is counteracting the influence of various psychological effects on theoretical predictions and experimental results.

Let us say that we measured the gravitational acceleration with a pendulum and got 9.7 ± 0.1 m/s². Then we calculate it from theory and find that it should be 9.81 m/s². But we note that we might have measured the length wrong, and, by the way, we probably started the timer a bit late - and if we just nudge some numbers within the error bounds, we get 9.75 ± 0.11 m/s². Then we proudly announce that the theory matches our experiment.

All is well until someone else comes along, makes the same measurement, gets a non-matching result, and is thereby led to discover a factor that we missed and that nobody had heard of before. And thus we have missed our chance for a huge discovery.

Or another example: consider a simple theory that says "a green color of an object causes it to fall when it is thrown". We set out to test it. We take a few green objects, throw them, they fall. We are proud of our confirmation of the theory. Someone else comes, throws a red object, it falls. Throws a blue one, it falls. Something doesn't really add up here.

These are just two examples of mind traps that one might fall into. A scientist has to be aware of them and actively counteract the possibility of falling victim to them.

The first described effect is usually counteracted by making predictions in advance. Correctness of a theory is studied by first calculating its predictions, and only then performing the experiment. Then one can honestly check if the predictions match or not - and if not, look for the source of the error.

The second effect is called confirmation bias. People have a tendency to look for confirmations of their suppositions. But if we are looking for a rule that is as general as possible, we also have to make sure that the predictions do *not* match under conditions where they shouldn't - and this part is often overlooked. This is the trap that many people fall into when faced with the riddle from the beginning of the article (if you haven't tried it yet - well, now you know what to look out for), and this is why it is important to try to *falsify* a theory when testing it, not to confirm it.

There are many various effects of this kind, so it is impossible to describe all of them here. I won't even try, then - I'll just say that it is a good idea to research this topic before announcing a revolution in science.

I wrote a lot about what science is and how it works. There is also a phenomenon that tries to pass as science, but is not science - it's pseudoscience. What are the characteristics of pseudoscience? How can you tell it from science? This is a topic for a whole book, I'll just describe a few signs here that should raise red flags when you encounter them.

Pseudoscientists love anecdotal evidence - ie. stories that confirm their claims, but are either hard to verify, or whose authenticity is of little importance to the subject at hand. An example of typical anecdotal evidence is "my aunt was taking homeopathic medicine and was cured", or "on website X a random man described how he performed an experiment and got a result contradicting a well-established theory". In both cases we know nothing about the reproducibility of the result - it could be caused by an unknown factor, a random fluctuation, or it could be a straight-up lie. In the case of medicine, effectiveness is tested using statistical methods in controlled trials (because even effective medicine doesn't provide 100% certainty of success in therapy). In the second example, a few independent confirmations would be needed to acknowledge the result - especially if it contradicts the rest of scientific knowledge.

A typical move of pseudoscientists is to focus on results that seem to confirm their claims, while completely ignoring those that contradict them, no matter how many there are. If somebody ignores the results of repeated studies that are inconvenient to them - it's a solid strike against their oft-declared scientific approach.

Pseudoscientists like to propose theories that sound reasonable and, at first glance, explain the observations - but if you look deeper, they would explain *any* observation. In other words - there is no imaginable observation that would prove their theory wrong. No matter what is observed, the theory explains it. "A negative result? I'm right. A positive result? I'm also right."

It is easy to identify such a theory by the impossibility of deriving any predictions from it. Since any result would agree with it, there is no telling which of the agreeing results will happen in reality.

This one is particularly amusing, because it is a classic projection of one's own shortcomings onto the opposite side.

A typical accusation of being unscientific is based on the fact that a theory is not the only possible explanation of an observation. It stems from a mistaken belief, or a purposefully wrong allegation, that scientific evidence cannot admit more than one interpretation. This is obviously absurd. A theory isn't scientific because it is the only one that can explain every single observation, and an observation isn't scientific because it admits only a single theoretical interpretation. A theory must be falsifiable and match all known experiments (where applicable); an experiment must be reproducible and have rigorously analysed uncertainties. That's it for the scientific requirements.

As a closing remark, I wanted to touch upon another topic that often likes to appear in the context of evolution, and that is often explained completely wrong.

In discussions about evolution, its opponents often like to raise the "argument": "evolution is just a theory". A common answer is to state that this mixes the colloquial meaning of the word "theory" with its scientific meaning; that the colloquial "theory" is closer to scientific "hypothesis", and that scientific "theory" is a hypothesis confirmed with observations. The first part is admittedly correct, but the second part is totally wrong.

I wrote at the beginning what a theory is - a system that allows making predictions about reality. There is nothing in the meaning of the word even remotely resembling "being confirmed"! A theory is a theory regardless of whether it matches experiments (is "correct") or does not (is "wrong").

It is true that the word "theory" is colloquially used as a synonym for a "hypothesis" or a "supposition", and that arguing "it's just a theory" is a simple equivocation. It's not true, though, that a theory is somehow the next stage in the development of a hypothesis, reached when the hypothesis is confirmed. Being a hypothesis and being a theory are two independent things. Presenting a new theory usually comes together with the hypothesis that it is correct, that it accurately describes reality - but this is where the relationship between the two ends.

It is also worth reiterating that a theory can never be confirmed. It can only be not falsified. If a theory can't be falsified, even though there were attempts - it is considered a good theory.

Well, this came out longer than I expected. I hope that I managed to explain the essence and the sense of science at least a bit, and to show what the scientific method is about. Awareness of these issues is particularly important now, when everybody can publish whatever they want on the internet and pretend to be a scientist, even if they know nothing about the topic. This text is supposed to be a kind of vaccine against such people - it should provide the Reader with knowledge allowing them to recognize whether someone is really presenting something scientific, or only pretending. If this is achieved - awesome. If not - well, I just hope that there was something valuable in it regardless :)

Let me just quickly recap what the discussion was about. One of the flat-Earthers insists that some landscapes look the way they should on a flat Earth, and not the way they should on a spherical Earth. He supports his claims by showing some photos he took and calculating proportions of distances between characteristic points, or sizes of some visible objects. It's actually a very reasonable approach - provided that one does everything earnestly, ie. calculates what proportions one should get on a flat Earth, and what they should be on a spherical Earth. As it turns out - and this is what the previous entry was about - a fully correct analysis must even take atmospheric refraction into account, even though it is negligible for most everyday purposes.

The refraction calculator I created on that occasion had one major drawback - it could only trace a single light ray at a time. Because of this, for every photo you had to choose some specific points and calculate e.g. ratios of some angles. This still enables getting some interesting results, but it isn't very attractive visually - it's just comparing numbers. So I came up with an idea to improve the situation a bit: what if I created software that would simulate multiple rays at once instead of just one, check where they hit the Earth's surface, and generate a whole panorama based on that...?

I had one piece of the puzzle ready: path calculations for a single light ray both on a spherical and a flat Earth. The code calculating the paths was admittedly a part of the refraction calculator, but it was possible to extract it into a separate library with relative ease. I still needed something that could load terrain data from somewhere (the program needs to know somehow when a ray hits land or water) and I had to write the image generator itself.

Writing this eventually took a dozen or so hours, but, as is usually the case with me, I finished the work over two months after starting it.

The terrain data is being loaded from the DTED (Digital Terrain Elevation Data) files provided to the program. Such files can be downloaded for free from the Internet, for example from the USGS website: https://earthexplorer.usgs.gov. Every file on the website covers an area corresponding to 1 degree in latitude and longitude. In my case it caused a small technical issue.

The problem is that my generator is supposed to be able to simulate a flat Earth, and the data is indexed with latitude and longitude - values strictly related to a spherical shape. The Earth is, in fact, spherical, so there is no working around that. I had to find some kind of a mapping of spherical coordinates on a flat surface. I eventually settled on treating every data fragment of 1 degree x 1 degree as a rectangle with sides of approx. 111 km (the distance between parallels separated by 1 degree of latitude) by 111 x cos(latitude) km (the distance between meridians separated by 1 degree of longitude at a given latitude). This, of course, distorts some directions and distances, but I had no better idea. If any flat-Earther is reading this and has a better one - I'm open to suggestions.
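The mapping described above can be sketched in a few lines. This is just an illustration of the idea, not the actual code of the generator - the function name, the reference-point convention, and the rounded 111 km constant are assumptions for this sketch:

```python
import math

# Approximate length of one degree of latitude on Earth, in kilometers.
KM_PER_DEG = 111.0

def latlon_to_flat_km(lat_deg, lon_deg, ref_lat_deg, ref_lon_deg):
    """Map geographic coordinates onto a flat plane, treating each
    1° x 1° tile as a rectangle: 111 km tall and 111*cos(latitude) km
    wide, as described in the text.

    Returns (x, y) in kilometers relative to a reference point. This
    simple mapping distorts directions and distances far from the
    reference.
    """
    y = (lat_deg - ref_lat_deg) * KM_PER_DEG
    x = (lon_deg - ref_lon_deg) * KM_PER_DEG * math.cos(math.radians(lat_deg))
    return x, y

# One degree of longitude at 50°N spans about 111 * cos(50°) ≈ 71.3 km:
x, y = latlon_to_flat_km(50.0, 18.0, 50.0, 17.0)
print(round(x, 1), round(y, 1))  # → 71.3 0.0
```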

Anyway, the program works. It's not blazingly fast, it takes about 10 minutes on my 8-core computer to generate a 960x600 image, but it manages. The code is available on GitHub: https://github.com/fizyk20/atm-raytracer

Let us now take a look at a few example results.

The views being simulated are the same ones I was analysing in the previous entry using the refraction calculator: Schneeberg seen from Praded and the mountains in New Zealand.

Let us start with the mountains, as they are less spectacular, but still interesting.

A reminder of what the view looked like:

Here is the simulation on a spherical Earth:

And here is the simulation on a flat Earth:

I already marked some features of the second simulation that disagree with the actual photo.

The arrow on the right points to a peak that is invisible in the photo - because it is hidden behind something in a nearer plane - exactly like in the spherical simulation.

The rectangle, on the other hand, marks a mountain range in the front that behaves incorrectly. If you take a close look at the photo and the spherical simulation, you can see that the range gets lower and lower until almost the very water line. In the flat simulation, though, because nothing hides behind the horizon, no analogous effect occurs and the range stays visible much higher.

A strong argument for the spherical Earth. Ironically, the picture comes from a video titled "1000% Flat Earth Proof" ;)

So now let us take a look at Schneeberg as seen from Praded:

Let us see what a simulation on a spherical Earth will show:

It looks very similar. You can see the same ridges that are present in the photo and the very peak of Schneeberg, also just like in the photo.

So what would the view look like on a flat Earth...?

Well... there is a difference. What is more, this view is zoomed out relative to the spherical Earth simulation (the horizontal field of view is 5 degrees; for the spherical one it was 2 degrees) - otherwise the mountains wouldn't even fit in the picture! For a better comparison: the reddish ridge in the middle is the same one that Schneeberg is peeking over in the spherical Earth simulation...

What is happening here? Again, the main reason for such a view is the lack of hiding behind the horizon. On a spherical Earth, the distance of 277 km lowers the mountains enough for their peaks to be below eye level for an observer on Praded. There is no such effect on a flat Earth, so peaks taller than Praded (and Schneeberg is 500 m taller) have to be above eye level.
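The "hiding behind the horizon" effect is easy to estimate with simple geometry. Below is a rough sketch (refraction ignored, elevations approximate, not the actual raytracer code), using the standard horizon-distance formula for a spherical Earth:

```python
import math

R_EARTH_M = 6_371_000  # mean Earth radius, meters

def hidden_height_m(observer_h_m, distance_m):
    """How much of a distant object is hidden below the horizon on a
    spherical Earth (simple geometry, refraction ignored)."""
    horizon = math.sqrt(2 * R_EARTH_M * observer_h_m)
    if distance_m <= horizon:
        return 0.0
    return (distance_m - horizon) ** 2 / (2 * R_EARTH_M)

# Rough numbers for illustration (assumed, not exact survey data):
# observer at ~1490 m (roughly Praded's elevation), target 277 km away.
hidden = hidden_height_m(1490, 277_000)
print(f"hidden below horizon: {hidden:.0f} m")  # roughly 1.5 km
# On a flat Earth nothing is ever hidden, so a peak a few hundred
# meters above the observer stays above eye level at any distance.
```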

By the way, this simulation is a nice illustration of the numbers obtained in the previous entry. The calculations showed that the visible part of Schneeberg should be about 0.075 degrees in size on a spherical Earth, and over 0.9 degrees on a flat Earth. This huge discrepancy is clearly visible in the simulations.

What is the verdict, then? As you can see, when you have something to compare the landscapes to, it is clear that the flat Earth claim is indefensible. Of course this doesn't bother eager flat-Earthers - but that was never the point. The main goal was to have fun writing the simulator; being able to get interesting results out of it is just added value ;)

I created a few more comparisons of simulations and real views in the form of video clips. They are shown below.

Schneeberg:

New Zealand:

I also created simulations of one more panorama - the photographer Witold Ochał managed to catch a view of the Tatry mountains from a village called Szkodna in Podkarpacie, Poland. Here are the results of comparing the simulations to reality:

I think that the video clips above speak for themselves.

As the Special Theory of Relativity seems to contradict common sense, it remains a somewhat magical topic for many people. The consequences of this theory seem so far removed from everyday life that it's quite hard to accept them as the correct description of the surrounding reality.

Most people have their first contact with SR at school, and its introduction there looks somewhat like this: near the end of the 19th century, electromagnetic waves were discovered. The equations describing these waves imply a specific speed of propagation, denoted c and equal to about 300 000 km/s. This was quite interesting, since nothing seemed to imply any frame of reference for this speed. Since all known waves required a medium to propagate, it was assumed that electromagnetic waves are no different and travel in something called the aether, and that the speed arising from the equations is relative to the aether.

Once people decided that aether should exist, the logical next step was to try and detect it. One of the ideas was to measure the speed of the Earth relative to the aether. Some attempts were made, but the results were unexpected - it seemed that the Earth is not moving in the aether. It was strange, especially considering that the Earth changes its velocity in its motion around the Sun, so even if it did stand still in the aether at one point, it shouldn't at another one - but the measured speed was always 0. People then tried to modify the concept of the aether to explain the results and started performing more sensitive experiments. One of these was the famous Michelson-Morley experiment, which, just like the earlier attempts, failed to detect the motion of the Earth.

Scientists were rather confused by these results. It seemed that the speed of light was constant regardless of the motion of the observer, which was quite extraordinary. To better illustrate what is so strange about this situation, let us imagine that we are in a car standing at an intersection, and that there is another car in front of us. Once the traffic light turns green, the car in front of us starts moving and accelerates to 15 m/s, so its distance from us starts to grow by 15 meters every second. We start moving shortly afterwards. Once we are moving at 5 m/s, we expect the car ahead to be leaving us behind by 10 m every second, but once we check, we are surprised to discover that the distance is still growing by 15 m/s. We accelerate to 10 m/s - and the distance is still growing by 15 m/s. We accelerate more and more, but we can't seem to start catching up to the car in front, even though our friend, a policeman, was standing with a radar near the road and told us that the speed of that car was always just 15 m/s. Light seemed to behave just like such a weird car.

The 20th century came and various people were proposing different explanations - among them Lorentz, Poincaré, and eventually Einstein. In 1905, Einstein presented a theory known today as Special Relativity, which was based on 3 assumptions:

- The space(time) is homogeneous and isotropic, ie. there are no special points or directions in the Universe.
- There are no special inertial frames of reference, the laws of physics are the same in all of them - this is the so-called Galilean relativity principle.
- The speed of light is the same in all frames of reference - this was a conclusion from the Michelson-Morley experiment.

Thus the aether became unnecessary - from that moment on, c was just a universal speed, independent of who is measuring it. Incidentally, this also has some unusual consequences, such as time passing more slowly for moving observers, or the contraction of moving objects.

There is still a loophole, though. One could argue - and some people do - that the third assumption is not adequately proven. The Michelson-Morley experiment might not have been sensitive enough, or it could give a null result under some specific circumstances even if the speed of light is not really constant. Thus, SR can be (and, according to some, just is) wrong.

This is all true, but not many people are aware that this third assumption isn't actually needed to obtain SR. I'm going to show here how this is possible.

I'll just note here that the derivation below is heavily inspired by a lecture by prof. Andrzej Szymacha, which I actually attended during my first year of studies. He showed us a reasoning that is almost identical to what I'm going to present, but a bit more complex in my opinion, so I decided to make small modifications.

Let us outline the situation, then. Imagine that we have two observers, who we will denote and . Both of them assign their own coordinates to the events in spacetime - they are for , and for . Both of them find themselves at a point with spatial coordinates equal to 0 in their respective coordinate systems, that is, we have for , and for . We also assume that both observers met in a single point at time and that is moving at a speed of in direction in 's frame of reference, so in 's frame the coordinates of satisfy .

Since we are only really interested in two directions - one temporal and one spatial - we will forget about , , , . They play no role in the conclusions, and it will simplify the reasoning a lot.

Another huge simplification will be to assume that the axes and point in the opposite directions. This way the situations of and are perfectly symmetrical - both moves away from in the positive direction, and moves away from in the positive direction. This perfect symmetry lets us immediately conclude that 's speed has to be in 's frame of reference, as everything looks exactly the same regardless of which observer is marked as , and which one is .

Let us move on to some more mathematical issues. For starters, let us note that the homogeneity and isotropy of spacetime mean that the transformation between the frames of reference must be linear, ie. the new coordinates can depend on at most the first powers of the old ones. Why? If there were higher exponents in the equations, they would change their form under translations, that is, if we changed the choice of the point denoted as the origin. We couldn't then declare all points equally good - at least one would stand out - and we are assuming that this is not so.

Linear transformations are pleasant in the way they can be written with matrices. We will then represent our transformation from to this way:

For the people not familiar with matrices - the notation above means exactly the same as this one:

Let us consider what we can deduce about the coefficients A, B, C, D.

First, since we know that the situation is symmetrical, we can immediately write:

The transformation from to has to be exactly the same as the one from to , because, as we mentioned, switching the observers' places changes nothing in the situation. Hence, we can write:

This simplifies to:

In order for everything to fit, the following must hold:

The equations 2 and 3 immediately lead to the conclusion that . The first and the fourth one are equivalent, then.

Denoting the transformation matrix as , we get:

What's next? Let us remember that we mentioned that is the same as . Since we can read from the matrix, we get:

Dividing by t, we will get . We can substitute this into and extract :

The transformation then takes the form:

This is a lot already, but we still don't know what is. In order to find out, we have to introduce some more complications.

First of all, let us give up on symmetry. The transformation in SR is usually written under the assumption that the axes and face the same direction. To achieve this, it is enough to flip the sign of . How do we do that?

Since we assumed , after flipping the sign we will get . So, in order to get the transformation with axes facing the same direction, it is enough to flip the signs of the bottom coefficients in the matrix. We will denote this "flipped" matrix as :

Let us also note that if we change the sign of the velocity (ie. it will be instead of ), we will get . If we now also flip the sign of , we are back in the same situation in 's frame (opposite speed and opposite axis, so the observer is moving away in the positive direction again). can't change, then. This means:

From this we can conclude that .

The second part of the whole ordeal is introducing a third observer. We will call him and we will say that he moves at a speed of relative to , ie. we have for . What is his speed relative to ? Let us denote it by , which will mean . In order to transition from to , we need to make a transformation by :

From this we get:

We substitute from the first equation to the second and we get:

Still with me? So now the other way round: moves at relative to , and at relative to , so we transform by from to :

So:

This gives:

Phew. We calculated in two ways. However, it is still the same , so both results must be the same. The denominators must be the same, then:

After subtracting 1 and dividing by we get:

So now we are reaching the climax. The left-hand side only depends on , and the right-hand side only on , which are two independent parameters. If we set a specific , the left-hand side will be determined, but is still subject to change. Despite that, the right-hand side cannot change, because it must still be equal to the other one. This means that both sides must be constant, equal to a number we will call :

Solving this for leads to the result:

We can thus write the final transformation:

All is great, but what exactly is ...?

Let us first consider the consequences of various possible values of .

This case is the simplest one. When , the transformation boils down to:

This is nothing else than the Galilean transformation! So if the constant turns out to be zero, it will mean that people have known the correct transformation since the 17th century.

This is also an interesting case. When the value of is negative, we can assume that it's . The transformation looks like this, then:

Let us introduce new variables: and . We get then:

Let us define an angle such that . This reduces the transformation to:

But! We know from trigonometry that:

So, we get:

This is just a rotation matrix for the angle ! So, in the case of a negative , time is just another spatial direction, and changing the velocity by is a rotation by the angle of .

As it turns out, the constant is actually positive in reality (I will tell you in a moment how we know). We can then write it as the inverse square of some constant with units of velocity. This constant has a special property. In order to see what it is, let us revisit the transformations of velocities.

We already transformed velocities when deriving the matrix coefficients. Let us do it again, then - assume that an object moves at a speed relative to (so it satisfies ) and see how it moves relative to :

Let us write again:

We get:

Dividing side by side, we get:

Let us see what happens when :

So, if an object moves at this particular speed relative to one observer, it also moves at the same speed relative to the other, regardless of their relative velocity. It is, then, a kind of **universal speed**, independent of the frame of reference.
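A quick numerical illustration of this property. I'm assuming here the standard relativistic velocity addition formula, w = (u + v) / (1 + u·v/C²), which is what the derivation yields for a positive constant; the value of C is taken to be the modern value of the speed of light:

```python
C = 299_792_458.0  # universal speed, m/s (taken equal to the speed of light)

def add_velocities(u, v):
    """Relativistic velocity addition: w = (u + v) / (1 + u*v/C**2)."""
    return (u + v) / (1 + u * v / C**2)

# An object moving at C relative to one observer moves at C relative
# to any other, no matter their relative speed:
print(add_velocities(C, 0.5 * C) == C)   # → True
# For everyday speeds the result is almost exactly Galilean:
print(add_velocities(10.0, 20.0))        # ≈ 30.0
```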

In the positive case we can also do a trick like what we did in the negative case, and introduce a value called "rapidity" such that: . Introducing, analogously, , we get:

It is a matrix of a transformation analogous to a rotation, but in a so-called Minkowski spacetime. I won't go into details here, but this idea turns out to be very useful in SR.

Now we know what the different values of the constant mean, but we still don't know its actual value in reality. We do, however, have a nice description of the phenomena that should occur for its various values, so we can try to measure it. Specifically, we know how to add velocities:

One of the first measurements was done in 1851 by the French physicist Armand Fizeau - though he didn't know back then that such a constant could exist, nor that it could be deduced from his measurements ;) What he did was measure the speed of light in air, in water, and in flowing water.

The speed of light in water is c/n, where n is the refractive index of water. According to Galileo's transformation, he expected to get a value of c/n + u in water flowing with speed u, but he actually got approximately c/n + u(1 - 1/n²). Let us see what we can deduce about the constant from this.

If we assume that u is small, we can approximate the formula for adding velocities:

where "..." stands for higher powers of u - terms that are even smaller.

When , we get:

Since the speed of light in water was much larger than the flow speed in Fizeau's setup, this approximately equals:

For this to agree with Fizeau's results, the universal speed appearing in the transformation must be equal to the speed of light, c. Not very surprisingly, the assumption that the speed of light is universal gives exactly the same result.
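We can check this against the exact addition formula numerically. A small sketch - the flow speed and the refractive index below are illustrative values I picked, not Fizeau's actual experimental parameters:

```python
C = 299_792_458.0        # universal speed, m/s (assumed equal to c)
N_WATER = 1.333          # refractive index of water (approximate)

def add_velocities(u, v):
    """Relativistic velocity addition with universal speed C."""
    return (u + v) / (1 + u * v / C**2)

u = 7.0                  # flow speed of the water, m/s (illustrative)
v_light = C / N_WATER    # speed of light in still water

exact = add_velocities(u, v_light)
galileo = v_light + u
fresnel = v_light + u * (1 - 1 / N_WATER**2)   # what Fizeau measured

print(f"Galileo : {galileo - v_light:.4f} m/s above c/n")
print(f"Fresnel : {fresnel - v_light:.4f} m/s above c/n")
print(f"Exact   : {exact - v_light:.4f} m/s above c/n")
```

The exact relativistic result agrees with Fizeau's measured "Fresnel drag" value to many digits, while the Galilean prediction is off by more than a factor of two.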

We got the Special Theory of Relativity without assuming that the speed of light is constant. To be precise, we got a result that there is a universal speed, which is approximately equal to the speed of light - but all experiments performed so far agree with the theory in which it is exactly the speed of light.

We have shown, then, that we can obtain SR without even assuming that the speed of light is the universal speed. Nevertheless, the experiments indicate that there indeed is a universal speed in nature and that it is, with very high accuracy, the speed of light (the original Fizeau experiment might not have been this precise, but 150 years have passed since then and we have much more precise results now). So even if it turned out that the speed of light can depend on the frame of reference - which isn't entirely out of the question - it would mean nothing for the phenomena that are such a pain to SR's opponents, like time dilation and Lorentz contraction, or for the existence of a universal speed. These phenomena arise from something much more general than just a constant speed of light, and changing their interpretation significantly would require a discovery much bigger than mere variability of the speed of light.

It might be good to remember about this the next time you encounter someone who would try hard to convince you that SR is a scientific conspiracy ;)

In the previous article, we covered:

- What are events and spacetime?
- What are world lines?
- Simple spacetime diagrams
- How does the inseparability of space and time influence their perception by observers?

Most of the illustrations in the last article used rotations, but it turned out eventually that rotations aren't the correct transformations that would let us look at the spacetime from the point of view of different observers. Now we will take a look at transformations that actually describe reality - the Lorentz transformations.

Rotations are probably quite familiar to everyone. You can just grab an object and move it around, you can spin something on a stick, the wheels of a bike or merry-go-rounds are rotating. Everyone learns to recognize things regardless of how rotated they are since early childhood. We understand rotations intuitively and we know what to expect of them.

Nevertheless, in order to understand the similarities and differences between rotations and Lorentz transformations in more depth, we have to take a look at rotations on a slightly more abstract, mathematical level.

Actually, what are transformations, whether in space or in spacetime? Simply put, a transformation is something that takes a point, let's call it A, and gives us another point, let's call it A'. If we are on a plane, we can describe a given point with, for example, a pair of coordinates (x,y) - a transformation will change it into some (x',y'). In spacetime we would usually describe a point with a set of four coordinates (t,x,y,z), and this will become (t',x',y',z') after a transformation.

We can use formulas to describe a rotation on a plane by an angle $\theta$ the following way:

$$x' = x\cos\theta - y\sin\theta$$
$$y' = x\sin\theta + y\cos\theta$$

Two things are really important here:

- If $x = 0$ and $y = 0$, then $x' = 0$ and $y' = 0$ - or, in other words, the rotation doesn't affect the origin. If we give the point (0,0) to a rotation to be transformed, it will give us back the same, unchanged (0,0).
- The distance of a point from the origin is the same before and after the rotation: $x'^2 + y'^2 = x^2 + y^2$ (I recommend calculating this yourself from the equations above as an exercise - you just need to remember the Pythagorean trigonometric identity: $\sin^2\theta + \cos^2\theta = 1$). The Pythagorean theorem tells us that the distance of a point (x,y) from the origin (0,0) is $\sqrt{x^2 + y^2}$. The value in the square root doesn't change under a rotation, so the point after the transformation will be at the same distance from (0,0).
- To expand on the previous point somewhat - if we have two points A and B, whose coordinates differ by $\Delta x$ and $\Delta y$, and we transform them into A' and B', whose coordinates will differ by $\Delta x'$ and $\Delta y'$, then still $\Delta x'^2 + \Delta y'^2 = \Delta x^2 + \Delta y^2$.

The second bullet point above also means that rotations transform circles centered at (0,0) into themselves - however you rotate a circle, it looks the same. A circle is a set of points at a given distance from the center - so if we have a point on a circle, at a given distance from the center, it will be at the same distance from the center after the rotation, so also on the same circle. You can actually see that in the animation above.
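Both of these properties are easy to verify numerically. Here is a minimal Python sketch (my own illustration, using the standard rotation formulas) checking that the origin stays put and that the distance from the origin is preserved:

```python
import math

def rotate(x, y, theta):
    """Rotate the point (x, y) around the origin by the angle theta."""
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

# the origin stays put
print(rotate(0.0, 0.0, 1.234))   # (0.0, 0.0)

# the distance from the origin is preserved
x, y = 3.0, 4.0
print(x * x + y * y)             # 25.0
for theta in (0.5, 1.7, -2.4):
    x2, y2 = rotate(x, y, theta)
    print(round(x2 * x2 + y2 * y2, 10))   # 25.0 every time
```

Whatever angle we pick, the rotated point lands on the same circle of radius 5.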

Why am I writing about all this? It should become clear in a moment - when we start talking about Lorentz transformations.

The Lorentz transformations aren't as intuitive to people as rotations. In a sense, we also deal with them since childhood (they are the transformations describing the relationships between moving observers, after all), but it's definitely much less visible.

Full Lorentz transformations work in 4-dimensional space-time, but just like in the previous article, we will limit ourselves to two dimensions for simplicity - time and one spatial dimension. Such a 2-dimensional space-time is very similar to a plane, and you can also describe its points with two coordinates - but we will be using (t,x) instead of (x,y).

Just like with rotations, we can write formulas that describe Lorentz transformations. In the context of the theory of relativity they are usually written like below:

$$t' = \frac{t - \frac{vx}{c^2}}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad x' = \frac{x - vt}{\sqrt{1 - \frac{v^2}{c^2}}}$$

These equations contain the relative velocity of the observers, the speed of light and a lot of physics in general. For now, we will look at these transformations in a bit more abstract way, and we will write them this way:

$$t' = t\cosh\varphi - x\sinh\varphi$$
$$x' = -t\sinh\varphi + x\cosh\varphi$$

We will not talk about what exactly $\varphi$ is for now (it has something to do with the velocity of the observer); right now we will focus on the similarities with rotations.

And there are a lot of similarities, indeed! Like in rotations, sines and cosines appear - except hyperbolic, not "regular" ones (you can read more about hyperbolic functions here). Also like with rotations, the point (0,0) is transformed into (0,0) - the transformation doesn't touch it. And again like in rotations, there is a value associated with every point that the transformation doesn't change.

Let us remind ourselves: rotations didn't change the distance of points from the origin, nor, consequently, the square of the distance, equal to $x^2 + y^2$. According to the hint for the calculations, this is related to the Pythagorean trigonometric identity - the fact that for any angle $\theta$, the equality $\sin^2\theta + \cos^2\theta = 1$ holds.

Well, there is actually a hyperbolic identity as well: for any $\varphi$, the equality $\cosh^2\varphi - \sinh^2\varphi = 1$ holds. And also because of that identity, the Lorentz transformations don't change the value $t^2 - x^2$ (or, if we want to measure time and distance in different units: $c^2t^2 - x^2$ - the equations written above simply assume $c = 1$). This value is called **the space-time interval**.

Just like in rotations, not only the interval between a point and the origin is conserved, but also between any two points: $\Delta t'^2 - \Delta x'^2 = \Delta t^2 - \Delta x^2$.
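To see this conservation concretely, here is a tiny Python sketch (assuming $c = 1$ and the hyperbolic form of the transformation) checking that the interval from the origin doesn't change under a boost:

```python
import math

def lorentz(t, x, phi):
    """Lorentz transformation written with hyperbolic functions (c = 1)."""
    return (t * math.cosh(phi) - x * math.sinh(phi),
            -t * math.sinh(phi) + x * math.cosh(phi))

def interval(t, x):
    """The space-time interval between (t, x) and the origin."""
    return t * t - x * x

t, x = 3.0, 1.5
print(interval(t, x))                    # 6.75
for phi in (0.2, 0.9, -1.3):
    t2, x2 = lorentz(t, x, phi)
    print(round(interval(t2, x2), 10))   # 6.75 every time
```

The coordinates themselves change quite a lot between observers, but $t^2 - x^2$ stays the same.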

As it turns out, the space-time interval has many properties not unlike those of distance. The main difference is that the square of the distance between two different points is always positive - the interval, on the other hand, can be either positive, negative or even zero. Since it is conserved, any points separated by a positive interval will also be separated with a positive interval after the transformation - and the same holds for negative and zero intervals. What does it mean?

In order to solve this riddle, let us consider the meaning of a zero interval. Assume that we have two points, A: $(t_A, x_A)$ and B: $(t_B, x_B)$, which have a zero interval between them:

$$c^2(t_B - t_A)^2 - (x_B - x_A)^2 = 0$$

Transforming this equation, we can get:

$$\frac{|x_B - x_A|}{|t_B - t_A|} = c$$

Let us remember that points in space-time are events. The event A happened in the place $x_A$ and time $t_A$, and event B happened in place $x_B$ and time $t_B$. $|x_B - x_A|$ is thus the distance between events A and B, and $|t_B - t_A|$ is the time that passed since A until B, or the other way round. Dividing the distance by the time we get the speed we would need to move at in order to cover this distance in this time - so in order to get from A to B, you have to move at $c$ - the speed of light.

And now the most important thing - the Lorentz transformations don't change the interval! This means that if we look from the point of view of a different observer - which corresponds to transforming A and B into A' and B' with a Lorentz transformation - the interval between A' and B' will also be zero! This means that if something moves at the speed of light in one frame of reference, it will be moving at the speed of light in all frames of reference. This is the famous invariance of the speed of light.

Let us take another look at the animated picture above. You can see two dark-yellow-brownish, oblique lines. These are the lines that correspond to moving at the speed of light. You can see that they are staying in place regardless of how the picture is transformed.

A similar thing applies to the cyan hyperbolas. Just as rotations don't affect circles, because they are the sets of points at a constant distance from the origin, the Lorentz transformations don't affect hyperbolas - the sets of points at a constant *interval* from the origin.

I'll refrain from going into the details of the analysis, but just like the lines correspond to a zero interval from the origin, the top and bottom hyperbolas correspond to positive intervals, and the left and right hyperbolas - to negative intervals. All in all, we can look at our space-time as divided into four quadrants with the light lines - all points in the upper and lower quadrant are separated by a positive interval from the origin, and all points in the left and right quadrants - by a negative interval.

Since the Lorentz transformations don't change the interval, no point from either quadrant can ever be transformed into a point in another one! This limitation becomes slightly weaker, though, if we add some spatial dimensions. Adding a second spatial dimension, which we can imagine as rotating the picture around the time axis (the vertical one), will change the light lines into a **light cone** and will divide the space-time into three regions instead of four quadrants.

These three regions are: the upper part of the cone - the future; the lower part of the cone - the past; and everything to the sides - so-called "elsewhere" - these are the events that can't be reached from the origin by moving at subluminal speeds.

One could ask - why is the future not just the upper half of the diagram, and the past - the lower half? After all, the points in the upper half all have time coordinates greater than zero, and those in the lower half - below zero... It's a very good question.

Let us take another look at the animation, and specifically at what happens to the points in the left and right quadrants. The animation shows the points being transformed one way and the other way, in alternating cycles. As the transformation distorts the picture, pretty much every point in the left and right quadrants sometimes gets to the upper half, and sometimes to the lower. This means that a point with a negative time coordinate can get transformed into a point with a positive one, and vice versa - so it can get "moved" from the future into the past, or the other way round! It cannot be said then, that any of these points is in the future or in the past - it depends on the observer! This only applies to the points from "elsewhere", though - the points from the upper quadrant (the upper part of the cone) are in the future of all observers, and the points in the lower quadrant (lower part of the cone) - in the past of all observers (careful, though: of all observers *that are at (0,0)* - the observers in other points have their own cones, slightly shifted relative to this one).
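This behaviour can be checked numerically as well. Below is a small Python sketch (my own illustration, assuming $c = 1$ and the hyperbolic form of the transformation): a point from "elsewhere" can have the sign of its time coordinate flipped by a boost, while a point inside the upper cone remains in the future for every observer:

```python
import math

def boost(t, x, phi):
    """A Lorentz transformation (c = 1) of the point (t, x)."""
    return (t * math.cosh(phi) - x * math.sinh(phi),
            -t * math.sinh(phi) + x * math.cosh(phi))

def region(t, x):
    s = t * t - x * x          # space-time interval from the origin
    if s > 0:
        return "future" if t > 0 else "past"
    return "elsewhere" if s < 0 else "light cone"

# a point from "elsewhere" with t > 0...
t2, x2 = boost(0.5, 2.0, 1.0)
print(t2 < 0, region(t2, x2))   # True elsewhere - its time coordinate
                                # flipped sign, but it stayed in "elsewhere"

# a point inside the upper cone is in the future of every observer
print(region(*boost(2.0, 0.5, -1.0)))   # future
```

However we choose the boost, the conserved interval keeps each point locked inside its region.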

I've said a lot about the Lorentz transformations so far, but nothing about how we know that they in fact govern our reality. Well - as you can expect, we have reasons to think that. It's not as if someone just came up with the idea and everyone took them at their word. The problem is, it is pretty complicated to show where the transformations come from.

To be more precise - it's quite easy to derive the Lorentz transformations once you assume that the speed of light is independent of the observer. This is how it was done in high school when I was a student (although I have no idea if it is still done this way, nor if it's even still a part of the school curriculum...). There are some further complications if we don't want to just take that on faith (even though the constancy of the speed of light is rather well documented) - some more effort is required then, but it is still possible; you can read more about it here.

That's all for this part. I'm still not sure what the next one will be about. The long-term plan was to move slowly towards explaining black holes and effects related to them, so the next post will probably be about curvature. Another possibility is a slightly deeper dive into Special Relativity - like analysing the twin paradox, for example. If you have some other topic you would like to see covered - leave a comment, and I'll be sure to consider it.

Any comments about the clarity of the text will also be appreciated! I'll gladly get to know what is not clear and improve it - I'd like the articles to be as easy to understand as possible.

Till the next time!

It all began with two flat-earthers appearing on a certain forum. The exchange started with standard arguments like timezones, seasons, eclipses, the rotation of the sky... what have you. As usual in such cases, those arguments were met with silence or really far-fetched alternative explanations. I'll omit the details, interested people can find standard flat-earth arguments on the web.

Well, you can't sway a person who is completely confident in their beliefs with arguments, so the discussion became somewhat futile. Both sides stuck to their positions and started mulling over the same issues time and time again. That is, until one of the flat-earthers started presenting photos which, according to them, proved that the Earth "can't be a ball with a 6371-6378 km radius", with descriptions that can be summarised shortly as "explain THAT!". Alright.

The most interesting part was when they touched upon the issue of this observation of the Schneeberg mountain from the Praděd peak:

What is the problem? Well, let's look at some of the facts:

- Praděd has an elevation of 1491 m ASL, but it is reasonable to assume that the observation has been made from a viewing platform that is found at the peak, which has an elevation of 1565 m ASL.
- The Schneeberg mountain is as tall as 2070 m ASL.
- The distance from Praděd to Schneeberg is 277 km.
- There is a hill between Praděd and Schneeberg, approx. 73 km from the former, with an elevation of 680 m ASL (in the picture, it is the hill with two wind turbines on top; the turbines are the two poles with red lights to the left of Schneeberg, located in reality a little to the east of the Czech town of Protivanov).

And everything would be perfectly clear if it wasn't for the fourth fact. To show why, let us calculate how tall Schneeberg would have to be for the hill near Protivanov not to obscure it.

We will use polar coordinates, assuming an Earth radius of $R = 6378\ \mathrm{km}$.

We have:

$$P_1 = \left(R + 1565\ \mathrm{m},\ 0\right) \quad \text{(the platform on Praděd)}$$
$$P_2 = \left(R + 680\ \mathrm{m},\ \tfrac{73\ \mathrm{km}}{R}\right) \quad \text{(the hill near Protivanov)}$$
$$\theta_3 = \tfrac{277\ \mathrm{km}}{R} \quad \text{(Schneeberg's angular distance)}$$

The equation of a line in polar coordinates is:

$$\frac{1}{r(\theta)} = a\cos\theta + b\sin\theta$$

Substituting the coordinates of the first two points, we get $a$ and $b$; then we calculate $r$ at Schneeberg's angular distance... And what do we get?

As it turns out, a line starting at Praděd and tangent to the hill near Protivanov arrives at Schneeberg at the elevation of... about 2600 m ASL!
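This calculation is easy to reproduce. Here is a short Python sketch of it (my own, using only the facts listed above; the exact result depends slightly on rounding):

```python
import math

R = 6_378_000.0  # assumed Earth radius [m]

# polar coordinates (r, theta); angles come from arc distances along the surface
r1, th1 = R + 1565.0, 0.0            # the viewing platform on Praděd
r2, th2 = R + 680.0, 73_000.0 / R    # the hill near Protivanov

# a straight line through both points: 1/r = a*cos(theta) + b*sin(theta)
a = 1.0 / r1                                     # from the first point (th1 = 0)
b = (1.0 / r2 - a * math.cos(th2)) / math.sin(th2)

th3 = 277_000.0 / R                  # Schneeberg's angular distance
altitude = 1.0 / (a * math.cos(th3) + b * math.sin(th3)) - R
print(round(altitude))               # about 2600 m ASL
```

The line grazing the hill indeed arrives at Schneeberg's distance roughly 2600 m above sea level.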

The hill near Protivanov should be obscuring Schneeberg. Schneeberg, however, doesn't seem to care and remains visible in the picture.

What is happening here?

Our flat-earther obviously concluded that this proves the flatness of the Earth. Objects wouldn't hide under the horizon on a flat Earth, so it would be no problem for Schneeberg to stick out from behind the hill near Protivanov.

Of course, there is another explanation, too, and it is atmospheric refraction.

To give you some introduction - refraction is a topic that needs to be treated very carefully in the presence of flat-earthers. To them, it is a keyword that explains everything: timezones, seasons, the horizon... generally everything that is a good argument for the Earth's roundness. Does something look different than it should on a flat Earth? Refraction! Of course it's no explanation at all - but if we don't accept refraction-based arguments from the other side, we need to be thorough when we use it ourselves. We wouldn't want to stoop to their level, would we? ;)

So, in order not to leave any hole in the explanation, I set off to prepare a thorough, quantitative analysis.

Before I present the approach I took, let me explain one more thing - flat-earthers have the tendency to question everything they can't check themselves. So, even though atmospheric refraction is well-explored and measured, I decided to start from some more basic principles. It is hard to question the laws of optics, being confronted with them every day, and it is hard to deny that the air gets less dense with altitude. Thus, I assumed a simplified model:

- The air density decreases exponentially with altitude.
- The deviation of the air's refractive index from 1 is proportional to its density.
- And, of course - the Earth is a ball with a radius of 6378 km.

The first point basically means assuming this equation: $\rho(h) = \rho_0 e^{-\alpha h}$.

The coefficient $\alpha$ can be derived from this equation by tying the density of air to its pressure with the ideal gas equation. This leads to $\alpha = \frac{Mg}{RT}$, where $M$ is the molar mass of the air, $g$ - the Earth's gravitational acceleration, $R$ - the universal gas constant, and $T$ - the temperature of the air. The constants can be found on Wikipedia, and we assume the temperature to be 273 K, which gives us $\alpha \approx 1.25 \times 10^{-4}\ \mathrm{m}^{-1}$.

Again on Wikipedia we can find the refractive index of the air at the pressure of 1 atmosphere and the temperature of 273 K, equal to 1.000293.

So, assumption no. 2 basically means: $n(h) = 1 + 0.000293\, e^{-\alpha h}$.
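For reference, here is how these numbers come together in a few lines of Python (the constants are the standard textbook values mentioned above):

```python
import math

M = 0.02896     # molar mass of air [kg/mol]
g = 9.81        # gravitational acceleration [m/s^2]
R_gas = 8.314   # universal gas constant [J/(mol*K)]
T = 273.0       # assumed air temperature [K]

alpha = M * g / (R_gas * T)   # decay coefficient of the exponential atmosphere
print(alpha)                  # about 1.25e-4 per metre

N0 = 0.000293   # refractivity of air at 1 atm and 273 K

def n(h):
    """Assumption 2: the deviation of n from 1 is proportional to density."""
    return 1.0 + N0 * math.exp(-alpha * h)

print(n(0.0), n(8000.0))   # the refractivity falls to ~37% at about 8 km
```

The scale height $1/\alpha$ comes out at roughly 8 km, which is why the refractivity drops to about $1/e$ of its surface value at that altitude.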

We now have a model of the atmosphere, but what's left is to see how the light propagates in such an atmosphere. We will use Fermat's principle for that.

Fermat's principle states that light takes the route between points A and B that minimises the optical length of the path (the integral of the refractive index). In other words, it can be expressed as follows:

$$\delta \int_A^B n\, ds = 0$$

This can be written in polar coordinates as:

$$\delta \int_A^B n(r)\sqrt{dr^2 + r^2\, d\theta^2} = 0$$

If we assume that the path of our light ray can be expressed with a function $r(\theta)$ (which is true as long as we don't consider vertical rays), this can be expressed with an integral over $\theta$:

$$\delta \int_{\theta_A}^{\theta_B} n(r)\sqrt{r'^2 + r^2}\, d\theta = 0$$

where $r' = \frac{dr}{d\theta}$ in this equation.

Problems of this kind can be solved with the Euler-Lagrange equation. If we assume $L(r, r') = n(r)\sqrt{r'^2 + r^2}$, the Euler-Lagrange equation will look like the following:

$$\frac{d}{d\theta}\frac{\partial L}{\partial r'} - \frac{\partial L}{\partial r} = 0$$

Omitting the intermediate steps (those who know calculus can perform them themselves; those who don't wouldn't get much out of it, anyway ;)), the final result is this:

$$r'' = \frac{n'(r)}{n(r)}\left(r'^2 + r^2\right) + \frac{2r'^2}{r} + r$$

where $n'(r) = \frac{dn}{dr}$.

We can do a quick sanity check now by examining the result for a constant $n$. In such a case, $n' = 0$ and we get $r'' = \frac{2r'^2}{r} + r$ - an equation that is satisfied by a straight line, $r(\theta) = \frac{r_0}{\cos(\theta - \theta_0)}$. This is correct, at least.

This equation doesn't look like it could be easily transformed further, to put it lightly. But we are lucky in that we have the 21st century now, and we have computers, so why don't we do some numerical analysis? I decided to create a small application that calculates paths of the light rays based on this equation (links to the source code and the compiled binaries are at the end of this post).

The first test: we do the calculation for a ray that starts tangentially to the surface. The deflection angles of such rays are important to astronomers and well-measured; in typical conditions the deflection is 34 arc-minutes. I tell the program to calculate the path up to an altitude of 200 km (high enough that the atmosphere shouldn't deflect the ray further) and get the result... 35 arc-minutes. Excellent for such a crude approximation!
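Out of curiosity, this first test can be reproduced independently with a short Python sketch (my own throwaway reimplementation of the same model, not the actual atm-refraction code; the constants follow the assumptions above):

```python
import math

R = 6_378_000.0   # Earth's radius [m]
ALPHA = 1.25e-4   # decay coefficient of air density [1/m]
N0 = 0.000293     # refractivity of air at the surface (1 atm, 273 K)

def n(r):
    return 1.0 + N0 * math.exp(-ALPHA * (r - R))

def dn(r):
    return -N0 * ALPHA * math.exp(-ALPHA * (r - R))

def rhs(r, rp):
    # r'' from the Euler-Lagrange equation for Fermat's principle
    return dn(r) / n(r) * (rp * rp + r * r) + 2.0 * rp * rp / r + r

def direction(theta, r, rp):
    # direction angle of the ray in a fixed Cartesian frame
    return math.atan2(rp * math.sin(theta) + r * math.cos(theta),
                      rp * math.cos(theta) - r * math.sin(theta))

# a ray starting tangentially to the surface: r(0) = R, r'(0) = 0
theta, r, rp = 0.0, R, 0.0
h = 1e-4                       # integration step in theta [rad]
dir0 = direction(theta, r, rp)
while r - R < 200_000.0:       # integrate up to 200 km altitude
    # classic RK4 step for the system (r, r')
    k1r, k1p = rp, rhs(r, rp)
    k2r, k2p = rp + h / 2 * k1p, rhs(r + h / 2 * k1r, rp + h / 2 * k1p)
    k3r, k3p = rp + h / 2 * k2p, rhs(r + h / 2 * k2r, rp + h / 2 * k2p)
    k4r, k4p = rp + h * k3p, rhs(r + h * k3r, rp + h * k3p)
    r += h / 6 * (k1r + 2 * k2r + 2 * k3r + k4r)
    rp += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
    theta += h

# a straight ray keeps a constant direction, so the change in direction
# is exactly the refractive bending
deflection_arcmin = abs(direction(theta, r, rp) - dir0) * 180 / math.pi * 60
print(deflection_arcmin)   # should land in the neighbourhood of 35 arc-minutes
```

The integration of $r'' = \frac{n'}{n}(r'^2 + r^2) + \frac{2r'^2}{r} + r$ is done with a plain RK4 stepper; nothing fancier is needed for this accuracy.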

Excited by this test, I decided to input the data from the Schneeberg case. The ray starts at 1565 m ASL, and we get the starting angle from the condition that it has to hit 680 m ASL at the distance of 73 km. What will be the altitude at 277 km? The result:

```
$ atm-refraction --start-h 1565 --tgt-h 680 --tgt-dist 73 --output-dist 277 -v
Ray parameters chosen:
  Starting altitude: 1565 m ASL
  Hits 680 m ASL at a distance of 73 km
Altitude at distance 277 km: 1688.2650324094586
```

*The ray will be at a bit less than 1700 m ASL at Schneeberg's distance!* This is almost 400 m below the peak, completely sufficient for the mountain to be visible! Success :D

We could stop at that, but our flat-earther decided to give me another challenge. "This video proves conclusively that the Earth is flat!", they wrote, supplying the following link: https://www.youtube.com/watch?v=oNdRhW1yQZ4.

For those that don't want to watch the video: the author shows the view of New Zealand's southern island from a bay near Wellington. Some peaks are visible. The author finds them on a map, gets distances and elevations, then compares the view with predictions from a flat-Earth and a round-Earth model. He gets agreement with the flat model, and disagreement with the round one. But is that so...?

The peaks are color-coded, and their data is shown on the following frame from the video:

The video itself contains a puzzling detail: the purple peak is marked as reaching 2362 m ASL, but later in the video the author shows a table of data in which 2410 m is entered. Why? No idea.

Anyway, I decided to input this data into my program and get the viewing angles of the various peaks. Assuming the horizon is at 0 (which the author didn't do, by the way), we get the following results:

The purple, yellow, red and green peaks fit surprisingly well! We have some trouble with the cyan (which got merged with yellow) and the blue peaks. Encouraged by the good fit for the other peaks, I started suspecting that the author misidentified the cyan and blue peaks. I resolved to try and find them myself.

Unfortunately, finding them on the map of New Zealand is a Sisyphean task. There are a lot of small and large peaks in the area. I decided to get some help from a panorama generator at http://www.udeuschle.selfhost.pro/panoramas/makepanoramas_en.htm. This generator lets you select the place and direction of viewing, and then draws the simulated view.

It was pretty easy to find the cyan and blue peaks on the generated panorama:

As you can see, their distances and elevations differ slightly from the ones given by the video's author. Let us try those values in the simulator, now...

Fits perfectly :)

What is the conclusion of this story? Well, I have drawn two main ones:

1. Don't assume that an effect is negligible, unless you've checked it (by a calculation or an experiment).

2. Landscapes would look different if there was no atmospheric refraction.

And, of course, nothing compares to the satisfaction from proving someone wrong with calculations :D

Finally, the promised links to the program:

The code: https://gitea.ebvalaim.net/ebvalaim/atm-refraction

(`atm-refraction --help` prints the options list)

Download “atm-refraction - Linux”: atm-refraction-0.2.2-linux.zip (1.32 MB)

Download “atm-refraction - Windows”: atm-refraction-0.2.2-win.zip (1.41 MB)

I came up with an idea of yet another test that could be conducted with the photo of Schneeberg.

It is based on the fact that the wind turbines from the photo (reminder: they are the two poles to the left of Schneeberg) can be found on Google Maps. They are exactly here:

Google Maps tells us that the distance between them is about 450 m. From the distance of 73 km, this gives an angle of about 0.3 - 0.35 degrees (0.35 would be for a line perpendicular to the line of sight, but it is slightly oblique in reality). Based on that, we can estimate the angular size of Schneeberg in the picture to be about 0.05 - 0.1 degrees.
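The turbine-separation part of this estimate is a quick one-liner (the 450 m comes from Google Maps, as mentioned):

```python
import math

separation = 450.0     # m, distance between the turbines (from Google Maps)
distance = 73_000.0    # m, distance from the observer to the hill

# small-angle approximation; this is the upper bound, valid for a separation
# perpendicular to the line of sight (the real line is slightly oblique)
angle = math.degrees(separation / distance)
print(round(angle, 3))   # about 0.353 degrees
```

Scaling Schneeberg's apparent size in the photo against this known angle gives the 0.05 - 0.1 degree estimate.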

The latest version of the refraction simulator has two interesting features: one, it can print the initial angle between the simulated ray and horizontal plane, and two, it can simulate a flat Earth. The light rays start at the observer, so this gives us an ability to calculate the viewing angle of Schneeberg and the hill near Protivanov both on a round and a flat Earth, and both with refraction and without. The results are presented below:

a) Round Earth, with refraction

```
Hill:
$ ./atm-refraction --start-h 1565 --tgt-h 680 --tgt-dist 73 --output-ang
-0.9565201819329879

Schneeberg:
$ ./atm-refraction --start-h 1565 --tgt-h 2070 --tgt-dist 277 --output-ang
-0.8812788180363719
```

Difference: 0.075 degrees

b) Round Earth, no refraction

```
Hill:
$ ./atm-refraction --start-h 1565 --tgt-h 680 --tgt-dist 73 --output-ang --straight
-1.0223415221033665

Schneeberg:
$ ./atm-refraction --start-h 1565 --tgt-h 2070 --tgt-dist 277 --output-ang --straight
-1.1397834768466832
```

Difference: -0.117 degrees (invisible)

c) Flat Earth, no refraction

```
Hill:
$ ./atm-refraction --start-h 1565 --tgt-h 680 --tgt-dist 73 --output-ang --straight --flat
-0.6945791903312372

Schneeberg:
$ ./atm-refraction --start-h 1565 --tgt-h 2070 --tgt-dist 277 --output-ang --straight --flat
0.10445608879994964
```

Difference: 0.799 degrees

d) Flat Earth, with refraction

```
Hill:
$ ./atm-refraction --start-h 1565 --tgt-h 680 --tgt-dist 73 --output-ang --flat
-0.6293267999337602

Schneeberg:
$ ./atm-refraction --start-h 1565 --tgt-h 2070 --tgt-dist 277 --output-ang --flat
0.3332380994146694
```

Difference: 0.963 degrees

As you can see, the predicted angular sizes of the visible part of Schneeberg vary wildly between models. One of the models fits perfectly, though... the round one, with refraction.

Even such a simple observation turns out to be a pretty solid proof for the roundness of the Earth!

I've been employed at MaidSafe for over a year and a half now. It's a small, Scottish company working on creating a fully distributed Internet. Sounds a bit weird - the Internet is already distributed, isn't it? Well, not completely - every website on the Internet exists on servers belonging to some single company. All data on the Internet is controlled by the owners of the servers that host it, and not necessarily by the actual owners of the data itself. This leads to situations in which our data is sometimes used in ways we don't like (GDPR, which came into force recently, is supposed to improve the state of affairs, but I wouldn't expect too much...).

MaidSafe aims to change all of this. It counters the centralised servers with the SAFE Network - a distributed network, in which everyone controls their data. When we upload a file to this network, we aren't putting it on a specific server. Instead, the file is sliced into multiple pieces, encrypted and distributed in multiple copies among the computers of the network's users. Every user shares a part of their hard drive, but only controls their own data - the rest is unreadable to them thanks to encryption. What's more, in order to prevent spam and incentivise the users to share their space, SAFE Network is going to have its own native cryptocurrency - Safecoin - but it won't be blockchain-based, unlike the other cryptocurrencies.

But enough advertising, let's get to the point.

The architecture of a distributed network gives rise to multiple challenges not present in the traditional Internet. For example, let's assume that I have a file in the network, which is stored on some computers, and I want to modify it somehow. After I do that, I try to read the file back from the network. How can I be sure that I'm seeing what I saved? There is no central authority - some computers storing my file might not have seen the update yet. Some might have disappeared from the network, their roles taken over by other computers that weren't even the destination of the message carrying the update. Some computers might be malicious and sending incorrect data on purpose - after all, the nodes of the network are regular users' computers, and we all know how popular trolling is. So, if the computers - the nodes of the network - are sending contradictory data, how do we know which version is the correct one?

The above is the so-called distributed consensus problem, but not only that - the potential presence of malicious actors extends it to what is known as the "Byzantine generals problem" (so named from the original formulation, in which generals had to independently decide whether to attack a city). This problem has been widely analysed since the 1980s, and there are multiple solutions to it - but that's still not the end! In our case, we want to be sure that the decision will be correct even if the messages being passed between computers are arbitrarily delayed. This introduces something called "asynchrony" - a lack of assumptions regarding timing. The problem defined this way is called ABFT - Asynchronous Byzantine Fault Tolerance.

There are solutions to ABFT as well - but they are either too slow, or too complex, or patented, or they have some other issues. This made us at MaidSafe decide to try and come up with our own algorithm, basing it on existing knowledge. PARSEC - a Protocol for Asynchronous, Reliable, Secure and Efficient Consensus - is the result of this effort.

PARSEC - contrary to its name - isn't fully asynchronous. In its current form, it contains some assumptions about delays in message delivery, which make it somewhat less sophisticated theoretically, but perhaps more practical. We are still looking for a way of getting full asynchrony, though.

I won't be getting into the technical details. For those interested, they are described here and here. In short, we combined the idea of a graph of "gossip" among the nodes with Asynchronous Binary Consensus (which is a type of consensus about a single value which can be either 0 or 1) using something called a "concrete coin" (roughly speaking, it is about computers being able to "toss a coin" independently and get 0 or 1 randomly, but so that they all get the same value with a large probability).

The whole field is very interesting and there is much left to be discovered there. For those interested in studying the issue a bit deeper - I encourage you to read the documents linked above, an article on Medium and to take a look at the SAFE Network community forums. I'll gladly answer any questions myself as well, so feel free to ask in the comments! :)

A more detailed description of those GIFs will be a part of the new post in the category Physics for everyone :)

I published the code I used to generate them on GitHub: https://github.com/fizyk20/spacetime-graph/tree/blog-post
