Since TrustedHousesitters.com's inception, there have been a number of versions of the sitter and assignment search functionality. The first iteration allowed search within whole countries only. The second iteration, introduced in 2012, added a map view and allowed our users to tag their location using a marker. The third iteration, in 2013, brought in the Google Places API and allowed reverse geocoding.

In an effort to reduce our third party dependencies, and be able to have more control over search results, we took the decision to replace the Google Places API with our own internal Gazetteer (a dictionary of geographic place names). There are a couple of openly available data sets, but after reviewing we settled on GeoNames. This contains more than 8 million places and is licensed under Creative Commons Attribution 3.0.

So what are the benefits?

First of all, we are no longer dependent on an external service that could in theory be switched off. Additionally we can now perform far more powerful location based querying. One of the most common tasks is determining locations that are nearest to another location. We're primarily using MySQL to store our location data, the following are excerpts of the SQL we use to calculate distance from a given longitude/latitude point.

Let’s say you store coordinates in a table called cities in a POINT field location(longitude, latitude) and you need to find 5 closest cities that are at least 20kilometers away from London, UK. London has coordinates (long: -0.12574, lat: 51.50853).

With this simple query:

SELECT *, ( 6371 * acos( cos( radians(-0.12574) ) * cos( radians( X(location) ) ) * cos( radians( Y(location) ) - radians(51.50853) ) + sin( radians(-0.12574) ) * sin( radians( X(location) ) ) ) ) AS distance FROM cities_city WHERE ( 6371 * acos( cos( radians(-0.12574) ) * cos( radians( X(location) ) ) * cos( radians( Y(location) ) - radians(51.50853) ) + sin( radians(-0.12574) ) * sin( radians( X(location) ) ) ) ) > 20 ORDER BY distance ASC; LIMIT 0, 5;

6371 being earth's mean radius in km or if you prefer to work in miles, set the earth’s radius to 3959 and updated distance from 20km to 12.43miles.

This distance algorithm uses the Haversine formula to calculate the shortest distance between two points on the earth’s surface assuming the surface is flat. More details about the formula can be found in this wikipedia article.

The results of the above query are a list of the 5 closest cities to London which are at least 20km away (Woodford Green (20km), Richmond (20.2km), Edgware (20.3km), Brentford (20.45km) and Coulsdon (21km) )

This is just a small example of how you can use distance calculation to get extra data you can show to user while they search.

An alternative to the Haversine formula uses Minimum Bounding Rectangles (MBRs) to get cities that are in the spherical rectangle of (lat +/- 10km, lng +/- 10km) from a given location.

SELECT * FROM cities WHERE MBRContains ( LineString ( Point ( X(location) + 10 / ( 111.1111 / COS(RADIANS(Y(location)))), Y(location) + 10 / 111.1111 ), Point ( X(location) - 10 / ( 111.1111 / COS(RADIANS(Y(location)))), Y(location) - 10 / 111.1111 ) ), POINT(-0.12574, 51.50853) );

If you want use miles instead of kilometres, use 69.0 instead of 111.1111;

The constant 111.1111 is the number of kilometres per degree of latitude, based on the old Napoleonic definition of the metre as one ten-thousandth of the distance from the equator to the pole.