Measuring geolocation accuracy with a ground truth dataset

Our ground truth dataset acts as a benchmark that we use to measure the accuracy of our IP geolocation data.

How do we determine our accuracy radius values?

If you have explored our custom IP to Geolocation Extended database, you have seen the “radius” column.

The radius column represents the accuracy radius of our IP geolocation data. This value expresses the “statistical confidence” in the accuracy of the provided geolocation information.

We establish this statistical confidence measure by comparing our computed IP address geolocations with a set of IP address geolocations that we are confident are accurate. These accurate geolocations act as the benchmark against which we measure the accuracy of individual IP address geolocation data.

We call these benchmark IP addresses with verifiable geolocation “ground truth data”.
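
The comparison described above can be illustrated with a minimal sketch. It assumes each location is a (latitude, longitude) pair in degrees and uses the great-circle (haversine) distance as the error measure. The function names and the choice of distance formula are hypothetical and chosen for illustration; they are not a description of the actual pipeline.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def error_distances_km(computed, ground_truth):
    """Map each IP present in both inputs to the distance between its
    computed location and its ground truth location.

    Both arguments are dicts of ip -> (lat, lon); this layout is an assumption.
    """
    shared = computed.keys() & ground_truth.keys()
    return {ip: haversine_km(*computed[ip], *ground_truth[ip]) for ip in shared}
```

A small error distance for a benchmark IP address suggests the computed location is close to the verified one; a large distance suggests the opposite.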

What is the ground truth dataset?

Ground truth data consists of IP addresses with verifiable geolocation data. This geolocation data is derived from a number of different sources. These sources can be:

How do we crowd-source ground truth information?

Most of the ground truth data is crowdsourced: users submit it voluntarily. For example, if you visit one of our IP data pages, such as IP Address Details, you will sometimes see a pop-up asking you to share some data with us.

Users who agree share their HTML5-based geolocation, which helps us keep our geolocation dataset accurate.

Do we include this ground truth data in our IP geolocation data?


No. We use it only to determine the accuracy of our geolocation data. In statistics and data science, this is called statistical model validation. Even though we know the ground truth data is accurate, we don’t include it in our geolocation data product. Our IP to Geolocation data is instead derived from data backed by our Probe Network.
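
To make the idea of validation concrete, here is a minimal sketch of one possible check. It assumes the accuracy radius is read as a distance within which the true location is expected to fall, and that an error distance in kilometres has already been computed for each ground truth IP address. The function name `radius_coverage`, the input layout, and the sample values are hypothetical; the ground truth is used only to score the data, never to change it.

```python
def radius_coverage(error_km, radius_km):
    """Fraction of benchmark IPs whose error distance is within the reported radius.

    error_km:  dict of ip -> distance (km) between computed and ground truth location
    radius_km: dict of ip -> reported accuracy radius (km)
    Both layouts are assumptions made for this sketch.
    """
    shared = error_km.keys() & radius_km.keys()
    if not shared:
        raise ValueError("no overlapping IP addresses to validate")
    hits = sum(1 for ip in shared if error_km[ip] <= radius_km[ip])
    return hits / len(shared)


# Made-up values using documentation-only IP ranges.
errors = {"192.0.2.1": 3.0, "198.51.100.7": 12.0}
radii = {"192.0.2.1": 5.0, "198.51.100.7": 10.0}
coverage = radius_coverage(errors, radii)  # one of two IPs falls within its radius
```

A coverage value close to 1 would indicate that the reported radius values are rarely exceeded on the benchmark set.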

