By Tim Maninger, Features Writer
If you were shown a randomly chosen picture from anywhere in the world, how accurately could you tell where it was taken? Even for the most well-traveled people, it is a stretch to guess within a thousand miles. But what if we built a sophisticated piece of software to do the same task? Tobias Weyand, a computer vision specialist at Google, has done just this. He used millions of location-tagged images to train a deep-learning system to infer where a photo was taken, and millions more to test its ability. The results look promising for the future of machine learning.
The program, called PlaNet, was given 2.3 million location-tagged images from Flickr to test its accuracy. It localized 3.6% of them to street level, 10.1% to city level, 28.4% to country level, and 48.0% to continent level. At first glance these figures may not seem impressive, but many of the images were taken indoors, with very little in the way of clues to their location, and most contained no identifiable landmarks.
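To see what "localized to street level" means in practice, here is a minimal sketch of how accuracy at different granularities might be measured: compute the great-circle (haversine) distance between the true and predicted coordinates, then count the fraction of photos whose error falls within a distance threshold for each level. The threshold values and the function names below are illustrative assumptions, not taken from the PlaNet paper.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    la1, lo1, la2, lo2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((la2 - la1) / 2) ** 2 + cos(la1) * cos(la2) * sin((lo2 - lo1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # mean Earth radius ~6371 km

# Hypothetical distance thresholds for each granularity, in km:
THRESHOLDS = {"street": 1, "city": 25, "region": 200, "country": 750, "continent": 2500}

def accuracy_at(pairs, threshold_km):
    """Fraction of (truth, guess) coordinate pairs with error within threshold_km."""
    hits = sum(haversine_km(*t, *g) <= threshold_km for t, g in pairs)
    return hits / len(pairs)

# One near-miss in Paris, one guess on the wrong continent:
pairs = [((48.85, 2.35), (48.86, 2.36)),
         ((48.85, 2.35), (40.7, -74.0))]
city_acc = accuracy_at(pairs, THRESHOLDS["city"])  # 0.5: only the Paris guess counts
```

Under this kind of scheme, a single photo can count as a "hit" at continent level while still being a miss at street level, which is why the four percentages in the results rise as the granularity gets coarser.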
To test PlaNet further, a competition was devised between the software and ten well-traveled humans. They played fifty rounds of a game called GeoGuessr (www.geoguessr.com), which places the player at a random Google Street View location and asks them to mark that spot on a map; the closer the guess is to the actual location, the higher the score. Amazingly, PlaNet won twenty-eight of the fifty rounds and scored a median error of 1131.7 km against the humans' 2320.8 km. It is safe to say that PlaNet performs its task at superhuman levels.
The software works by splitting the map of the world into thousands of rectangles of varying size, with a higher density in areas where more photos are taken. It then uses the information in a new image to determine the most likely grid cells the image could have come from, based on other images it has seen from each cell. Since this is a machine-learning system, the more images it sees, the more accurate it becomes, and it has already seen more images than a human ever could. PlaNet can identify locations better than any human and, with a memory footprint of only 377MB, it could fit on almost any modern computing device.
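The adaptive-grid idea described above can be sketched as a simple recursive subdivision: any map cell containing more than a threshold number of tagged photos gets split into four quadrants, so photo-dense regions end up covered by many small cells and empty regions by a few large ones. This is a deliberately simplified illustration; the thresholds, depth limit, and quadtree scheme here are assumptions for the sketch, not the production system's actual partitioning.

```python
MAX_PHOTOS = 2   # split threshold (illustrative; the real system uses far more)
MAX_DEPTH = 8    # stop subdividing past this depth

def subdivide(cell, photos, depth=0):
    """cell = (lat_min, lat_max, lon_min, lon_max); photos = [(lat, lon), ...].
    Returns a list of (leaf_cell, photo_count) pairs covering the cell."""
    inside = [(la, lo) for la, lo in photos
              if cell[0] <= la < cell[1] and cell[2] <= lo < cell[3]]
    if len(inside) <= MAX_PHOTOS or depth >= MAX_DEPTH:
        return [(cell, len(inside))]          # sparse or max-depth cell: keep as one leaf
    la_mid = (cell[0] + cell[1]) / 2
    lo_mid = (cell[2] + cell[3]) / 2
    quads = [(cell[0], la_mid, cell[2], lo_mid),   # split into four quadrants
             (cell[0], la_mid, lo_mid, cell[3]),
             (la_mid, cell[1], cell[2], lo_mid),
             (la_mid, cell[1], lo_mid, cell[3])]
    leaves = []
    for q in quads:
        leaves.extend(subdivide(q, inside, depth + 1))
    return leaves

# A dense cluster around one city plus two scattered photos:
photos = [(48.85, 2.35), (48.86, 2.34), (48.84, 2.36),  # tight cluster
          (40.7, -74.0), (-33.9, 151.2)]
grid = subdivide((-90.0, 90.0, -180.0, 180.0), photos)
```

Running this on the sample points produces many tiny cells around the cluster and a handful of huge cells elsewhere. The recognition step then becomes a classification problem: the network outputs a probability for each leaf cell, and the predicted location is the most probable cell, which is exactly why finer cells in photo-dense areas translate into finer localization.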
The uses of this technology, and the ethics of those uses, are still up for debate. The software could help authorities track down suspects or locate victims. It could also breach privacy, allowing anyone to feed a person's photos into it and work out where that person lives, where they work, and where their kids go to school. New technologies always raise new questions, and questions of ethics are often overlooked by the people who create a technology and left to those who use it. Safeguards for privacy and security are often sacrificed for convenience. All of this raises the question of whose responsibility these issues should be.