In the access area. Find the distance from a point to an area and reduce reverse geocoding requests

More than once I had to implement the functional of calculating the distance from a certain geographical point to an area on the map – for example, to the Moscow Ring Road. As a result, I found two ways to solve the problem that showed good results, and now we regularly use them in production. I will describe them in the first part of the article. And in the second I’ll show how you can cache geodata to use geocoder less.

Part one. Two ways to find the distance from a point to an area on a map

If your mobile application is truly mobile, it works with the coordinates of the device. The location of the user (and device) affects various business indicators of the application, such as delivery cost, complexity factor, etc.
Below I will show examples of implementation of algorithms in python using the scipy and shapely libraries.
For geocoding we use Google Maps. It suits both functionality and cost of use. At the time of this writing, Google allows you to make the first 20,000 geocoding requests per month for free.

Method 1. We calculate the route based on the vertices of the polygon

Suppose we need to find the distance from a point somewhere in the Moscow region to the Moscow Ring Road. A real path is needed, not a geometric one. Therefore, first we build a landfill from the exit points from the Moscow Ring Road, and they do not coincide with the tops of the outlines of the road on the map.

exit_coordinates: List[Tuple[float, float]]
latitude: float
longitude: float

To work with geometry, we use the library shapely. With its help it is easy to determine whether the point of interest to us is inside the polygon or not.

from shapely.geometry import Polygon, Point
polygon = Polygon(exit_coordinates)
polygon.contains(Point(latitude,longitude))

The task of finding the nearest objects to the starting point is quite typical in mathematics. Usually it is solved using a KD tree. So let’s do it. In python, the tree is implemented in the library scipy. We will find the nearest peaks from the exits from the Moscow Ring Road to our point.

from scipy.spatial import KDTree
kd_tree = KDTree(exits_coordinates)
dists, indexes = kd_tree.query((latitude, longitude), k=7)
nearest_coordinates = list()
for _, index in zip(dists, indexes):
		nearest_coordinates.append(exits_coordinates[index])

Number k – the number of peaks – should not be too small, because the nearest exit from the Moscow Ring Road may be temporarily blocked. Or it may be somewhere beyond the river and not have a direct path to our point. Too big k also useless for us, because a suitable exit is located in several nearby peaks. Let’s say there will be seven peaks.
Now let’s turn to the Google Maps service. It already has a functionality for finding distances from an array of points to a specific point – the Distance Matrix API.

import googlemaps
gmaps_client = googlemaps.Client(key=settings.GOOGLE_MAPS_API_KEY)
gmaps_client.distance_matrix(
    origins=nearest_coordinates,
    destinations=(latitude, longitude),
    mode='driving',
)

We can only parse the answer. Usually it has several routes, and the very first is not always the shortest. Therefore, we choose the appropriate value.

Method 2. We calculate the route from the center of the landfill

Now, in addition to the vertices of the polygon, we define some point inside it, from which we will build all the routes of Google Maps. We will use the Directions API, which will build routes for us, and choose the shortest of them.

start_point: Tuple[float, float],
destination: Tuple[float, float]
polygon: shapely.geometry.Polygon
gmaps_client = googlemaps.Client(key=settings.GOOGLE_MAPS_API_KEY)
driving_path = gmaps_client.directions(start_point, destination)

We start iteratively from the end to add the lengths of the path segments until the coordinate of the beginning of the segment is inside the polygon (parsing is omitted):

for step in reversed(driving_path):
    start_point = step['start_location']['lat'], step['start_location']['lng']
    if is_inside_polygon(start_point, self.polygon):
        end_point = step['end_location']['lat'], step['end_location']['lng']
        distance += get_part_outside_polygon(
            get_line(start_point, end_point), polygon
        ) * step['distance']['value']
        break
    distance += step['distance']['value']

We can only add part of the segment outside the polygon. To do this, using the shapely library, we build a geometric line between the coordinates of the beginning and end of the path and find how much of its length lies outside the polygon. The same percentage of the length is calculated for the obtained segment of the path.

A path is a terrain route on real roads with natural curvature. If it is a long straight line (avenue or highway), then our approximation will allow us to calculate the route exactly in percentage terms.
If the road crossing the landfill is curved enough, such an approximation will be incorrect. But the curved parts in the Google Maps route itself are usually short, and the errors in their calculation will not affect the result.

from shapely.geometry import LineString, Polygon
line = LineString(point1, point2)
part_outside_len = float(line.difference(polygon).length) / float(line.length)

To be honest, I did not compare these two methods. I have been using them for more than a year, both work without failures. I decided not to paint their implementation in detail. Instead, he opened access to his library for working with geo. Liba can also use regular geocoding, including with efficient caching. Issues and pull-requests are welcome!

Part two. Save Reverse Geocoding Requests

Often we need to determine the addresses of points that are nearby and correspond to one object in the real world. Each time, making a request to an external geocoding system is not cool, and here the question of caching reasonably arises.
Typically, the client sends the coordinates with excessive accuracy, for example, 59.80447691, 30.39570337. But how many signs in the fractional part will be significant for us?
For the lazy and in a hurry, the answer is four. For everyone else, see the explanation below.

First, a little geography.

  • The equator is 40,075.696 km long, and it is zero latitude.
  • Meridians are lines of longitude, perpendicular to lines of latitude. The length of any meridian is 40,008.55 km
  • Degree of latitude – 40,008.55 km / 360 = 111.134861111 km. So, one hundredth gives a change of about a kilometer, and one ten thousandth gives a change of 10 meters.
  • The circumference of the line of latitude decreases from the equator, and is multiplied by the cosine of the angle of latitude, therefore, for 60 degrees of latitude (latitude of St. Petersburg), the length is less than exactly 2 times less. Then the degree of longitude is 40,075.696 * 0.5 / 360 = 55.66066888 km, or one ten thousandth is 5 meters.

An error of 1/10000 degrees gives a rectangle of error of 5-10 by 10 meters. This will allow us to exactly “get” into the building, as the building is much larger than the point. And if, due to an error, the point does not fall into the building, Google will still determine the closest to it.
Rounding up to three characters is risky, accuracy is already significantly reduced. And up to five – it makes no sense, because the accuracy of GPS-transmitters does not exceed just a few meters.

Thus, if there is a value in the cache with the key “address: 59.8045,30.3957”, then all coordinates, when rounded to 4 characters, the same key is obtained, correspond to one geocoded address. We make fewer requests – we work faster and pay less for using the geocoding service. Profit!

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *