Predicting Solar Eclipses with Python

April 8, 2024 Author of the article, founder and CEO of the company Modal LabsErik Bernhardsson planned to watch his first total solar eclipse. The day before, he had an idea – what if he tried to calculate the periodicity of this phenomenon in the past and future using Python? Despite minor difficulties with the coordinate system, the author managed to create a working solution in just a few hours.

Below you can read how, using ~100 lines of code, we were able to calculate and track the path of every solar eclipse from 2020 to 2030.

*Please note that the author's position may not always coincide with the opinion of MyOffice.


When I was planning to watch a solar eclipse, I was curious about how hard it would be to calculate its periodicity using Python. I didn't want to delve into astronomy to do the calculations, so I decided to use Python's wonderful ecosystem for everything. It turns out that in Astropy package has about 80% of what I need. In particular, the package allows you to calculate the position of the Sun and Moon in the sky quite easily.

After a few minutes of googling, I came up with a code that calculates the occultation of the Sun by the Moon for a given point on Earth:

from astropy.coordinates import EarthLocation, get_body
from astropy.time import Time
from astropy.units import deg, m

def sun_moon_separation(lat: float, lon: float, t: float) -> float:
    loc = EarthLocation(lat=lat * deg, lon=lon * deg, height=0 * m)
    time = Time(t, format="unix")
    moon = get_body("moon", time, loc)
    sun = get_body("sun", time, loc)

    sep = moon.separation(sun)
    return sep.deg

To do this, the code takes a latitude/longitude pair and a unix timestamp and calculates the angular distance between the Sun and the Moon. Basically, this just means the distance between the centers of objects in the sky as seen from Earth. If the angular distance is very close to zero, you get a solar eclipse.

But! My goal was not to calculate for given coordinates. I wanted to calculate the location of the total eclipse based on the timestamp (if there is one).

Ideally we need to have 3D coordinates of the Earth, Sun and Moon. Then we need to project a line between the Sun and Moon to see if that line passes through the Earth, and if so, we need to find the latitude and longitude of that intersection. This may be the “right” way to do it, and if I had the time, I'd reach into my half-forgotten geometry knowledge and do just that.

But I didn't have time. The eclipse is tomorrow, and I just wanted to calculate the coordinates in the least complicated way. We already have a tool that calculates close parameters, but we need to change things up a bit. We'll do this with the thing I always use in such cases: the direct search method (black box optimization).

Solving for coordinates using direct search method

We have a function that takes three values ​​(a timestamp, latitude, and longitude) and returns the angular distance between the Sun and the Moon. But let's try a related problem instead: given a timestamp, find the latitude and longitude that minimize the distance between the Sun and the Moon.

If the minimum distance is almost zero, it means that we have found a solar eclipse. In this case, the coordinate that minimizes the function is the center of the moon's shadow on Earth.

Minimizing such an arbitrary function is quite simple. The package is perfect for this scipy.optimizewhich has a bunch of proven functions that were probably implemented in Fortran 77 – you can check it yourself if you want. We don’t even have the gradient of the function, but that’s okay – Nelder-Mead to help you.

The best part is that we can treat this function as a complete “black box” and optimize it from the outside. True, this requires some computational effort, but personally I wouldn't worry about it.

Code to use scipy.optimize.minimizeto find the location of the eclipse in the end it will be like this:

def find_eclipse_location(dt: datetime) -> tuple[float, float] | None
    """Возвращает координаты полного затмения или `None`"""
    t = datetime.timestamp(dt)
    fun = lambda x: sun_moon_separation(x[0], x[1], t)

    ret = minimize(fun, bounds=[(-90, 90), (-180, 180)], x0=(0, 0))
    return ret.x if ret.fun < 1e-3 else None

In essence, we are tying time to sun_moon_separation and build a new function with two variables: latitude and longitude. And then iterate over this function (with given constraints) to find the minimum.

Almost everything works! But part of the problem was that I wasted two hours because of a stupid mistake with the signs in the latitudes and longitudes. But even after fixing it, I got weird distorted coordinates.

I Thinkthat this was due to false minima, since I believe that the opposite of one solution is another solution. Obviously, we should discard solutions when the Sun not visible. Two simple modifications – and everything will work:

  1. If the Sun or Moon is below the horizon, we will return some large number.

  2. Instead of using (0, 0) as the starting point, perform a simple grid search at several points on the Earth and pick the one where the distance between the Sun and Moon is the smallest. Then use that point as the starting point for optimization.

My final code For sun_moon_separation And find_eclipse_location ended up being only slightly more complex than what I described above. With these modifications, we got a function that reliably takes any timestamp and determines the latitude/longitude of a solar eclipse (if there is one).

Search all eclipses

So, let's find more eclipses! Specifically, let's figure out the path of each eclipse between 2020 and 2030. To do this, we'll need to iterate set timestamps.

Alas, the function find_eclipse_location works quite slowly.

What do we do? Let's add a couple more tricks:

  1. We do a rough search over the entire decade, checking only every hour. If we find an eclipse, we do a more detailed search and work our way minute by minute.

  2. Let's parallelize!!!

I used the framework Modalwhose tools let you take Python code and run it in the cloud. The platform is great for scaling compute-intensive functions.

Now we can find all eclipses of 2020-30. Add a simple decorator to find_eclipse_location and then we perform the mapping procedure. The resulting code looks like this:

def run():
    dt_a = datetime(2020, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
    dt_b = datetime(2030, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

    # Compute evenly spaced datetimes
    dt = dt_a
    dts = []
    while dt < dt_b:
        dts.append(dt)
        dt = dt + timedelta(seconds=3600)

    # Map over it using Modal!!!
    for tup in find_eclipse_location.map(dts):
        if tup is not None:
            print("Found eclipse at", tup

Plotting a graph

I'm leaving out some details in the code itselfbut bear with us. Now that we have all the paths, we can draw them. I used Basemap and pretty quickly got this:

from matplotlib import pyplot
from mpl_toolkits.basemap import Basemap

def plot_path(dts: list[datetime], lats: list[float], lons: list[float]):
    # Set up a world map                                                                                                                                                                              
    pyplot.figure(figsize=(6, 6))
    lat_0, lon_0 = lats[len(lats) // 2], lons[len(lons) // 2]
    bm = Basemap(projection="ortho", lat_0=lat_0, lon_0=lon_0)
    bm.drawmapboundary(fill_color="navy")
    bm.fillcontinents(color="forestgreen", lake_color="blue")
    bm.drawcoastlines()

    # Plot eclipse path
    x, y = bm(lons, lats)
    bm.plot(x, y, color="red")

I added a few more elements to my final script, including local time using timezonefinder to search for local time zones using latitude/longitude pairs.

Here's what the April 8, 2024 eclipse would look like if we plotted the data on a map using our script:

Fabulous!

In reality, this probably won't win you any “Best Designer” awards, but it's a pretty decent starting point – the point here isn't about beauty, it's about finding eclipses in ~100 lines of Python.

Which is exactly what the script does! In fact, it finds all eclipses in the period 2020-2030:

2020-06-21 over Africa, the Middle East and Asia

2020-12-14 over a tiny piece of South America

2021-06-10 over northern Canada and Greenland

2021-12-04 over Antarctica

2023-04-20 over Australia and Papua New Guinea

2023-10-14 over the USA, Central America and South America

2024-04-08 over Mexico, the USA and Canada

2024-10-02 over a small piece of South America (again?)

2026-02-17 over a tiny piece of Antarctica (will anyone see it?)

2026-08-12 over Greenland and Spain

2027-02-06 over a tiny piece of South America (and for the third time??)

2027-08-02 over North Africa and the Middle East

2028-01-26 over South America and Spain

2028-07-22 over Australia and New Zealand

My list is identical to what I found online, which is very encouraging.

The total calculation time is several minutes.

Of course, this is a crude approach. I'm sure NASA has a C++ version that runs 1000 times faster. But from a developer productivity standpoint, it's a win, even without the fact that we were also building maps!

Notes

  • Lucky people in southern South America catch three eclipses per decade.

  • If you want to check the code, here it is Here.

  • I was inspired with this postwhere for calculations similar to mine, the Mathematica system was used.

  • My code uses the following as a starting point: code from Stackoverflow.

  • I ignored the difference between annular and total eclipses in your code. I think it's easy to fix if needed.

  • I also did not calculate the width of the path of the total eclipse, that is, the width of the sun's shadow on Earth. Only the path of the center of this shadow.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *