We are about to switch to a new forum software. Until then we have removed the registration on this forum.
I have a .csv file with roughly 6k entries. They are cities with associated latitude and longitude.
I have a map on the screen that pops up when you run the sketch. Pixel coords <--> lat/long is working fine.
I want the user to be able to click on the map and to find the nearest city and print out its name.
I know how to do this in an inefficient way, such as this code/psuedocode (hope that's ok!)
float shortestDistance = 999;
for (int i = 0; i < data.length; i++) {
float testDistance = dist(data[i].lat, data[i].lon, pixelToMap(mouseX), pixelToMap(mouseY));
if (testDistance < shortestDistance) {
shortestDistance = testDistance;
shortestIndex = i;
}
}
println(data[shortestIndex].city);
... something like that anyway. I just feel like there is probably a smarter way to do this, and this is an area of programming that I am pretty unfamiliar with. Any help is greatly appreciated.
Answers
Pre-process the list for each element, and store the index of the nearest city with the city object?
data[i].nearestCity = indexOfOtherCity;
So, you don't really want a sort, but just find a minimum.
Your code is actually fine, since the data will change each time it is run.
Perhaps pre-compute the pixelToMap() calls (before the loop), to avoid additional computations.
NOTE: The algorithm I made up does not work for all cases after all as koogs pointed out, scroll down to see why. Didn't imagine that scenario when I thought it up.
Pick one of the dimensions, say x. Sort all of the points by x (this takes O(n log n) time but only has to be done once).
Use a binary search to find the point closest on the x-axis and determine its distance to your point. This will take O(log n) time to find the point in the sorted list.
Test the distance going left in the list until you find a point with a larger distance than the last one you tested. The last one you tested is the closest on the left side. This has to be done because sorting by x didn't give any information about y.
Do the same thing going right. Compare the distances of the "best" points on the left and right side (1 comparison).
In the worst case this would still take O(n) time like your example but it is extremely unlikely that this would happen unless it was set up for worst case on purpose.
Edit: After the initial sorting, the worst case for this would be 13 steps for the binary search (log base 2 of 6,000) + some number to the left + some number to the right + 1.
dist() does an expensive sqrt calculation, you can get rid of that by squaring both sides
THAT SAID, the original takes 2ms. 2 milliseconds. it's barely worth worrying about.
(i've seen this done with many more points where the space was divided up into a grid. then when the mouse was clicked you only needed to check the points in the square you clicked in and the eight surrounding squares, which cuts down the work considerably. won't work if the distribution is such that you have empty squares though - there's a chance of finding nothing at all)
asimes, does your algorithm work with the above situation? the red dot is closest to the black on the x axis. the blue is further away so, as i understand it, you'd choose the red dot as nearest. but the green dot is actually closer.
float testDistance = ((pos[i].x - x) * (pos[i].x - x)) + ((pos[i].y - y) * (pos[i].y - y));
Don't forget Processing got the sq() function! :D
This class Colour I've made uses both Arrays.sort() & Arrays.binarySearch()!
File "RGB.txt":
And program:
I go along with @PhiLho - all you need is to search for the minimum distance and your code does that.
It is a simple O(n) problem that for 6000 items is not worth the effort of trying to write sophisticated algorithms to save a few nanoseconds, especially since the code will only be executed on a mouse click and will the user notice if it takes an extra millisecond or so to show the name of the city, I think not.
@koogs, Ah I guess not then. I didn't do a real proof, just drew some dots on a piece of paper. I'll put a note up next to it
I guess I didn't realize it would be so quick. I only have a small dataset I made to test it, I keep forgetting to grab the big 6k entry database from home ;p
Thank you all so much for your help, I found every suggestion very useful and made the necessary changes! I'll let you know if it's slow for some reason.
One thing I have in mind--not that I'd implement it--but I'm curious how you might go about having the search wrap around the globe. For instance, if I click way off to the right of my map, which ends to the east of Asia, it won't recognize that Hawaii is the closest place, rather than Cairo. Thoughts?
THX!
It reminds me of an old post, not sure if its related though. http://forum.processing.org/one/topic/gradual-movement-within-a-for-loop.html#responseContentContainer_25080000002003953
If the longitudinal (east<>west) distance between the cities (d) > half the map width (mapwidth) then the distance in the other direction is shorter (mapwidth - d). I have also assumed that the x direction is longitude (east<>west) and y direction is latitude (north<>south). Your original post appears to have these swapped.