Measuring geolocation accuracy with a ground truth dataset

Abdullah · July 20, 2023, 3:26pm

Our ground truth dataset acts as a benchmark that we use to measure the accuracy of our IP geolocation data.

How do we determine our accuracy radius values?

If you have explored our custom IP to Geolocation Extended database, you have seen the “radius” column.

The radius column holds the accuracy radius of our IP geolocation data. It expresses our “statistical confidence” in the accuracy of the provided geolocation information.

We establish this confidence measure by comparing our computed IP address geolocations with a set of IP address geolocations that we are confident are absolutely accurate. These accurate geolocations act as the benchmark against which we measure the accuracy of individual IP address geolocation data.

We call these benchmark IP addresses with verifiable geolocation “ground truth data”.
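To make the comparison concrete, here is a minimal sketch of how a radius could be derived from ground truth data. It is an illustration under assumptions, not our production method: the function names, the nearest-rank percentile, and the default 90% coverage level are all hypothetical choices made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def error_radius_km(pairs, coverage=0.9):
    """Radius that contains `coverage` of the observed errors.

    `pairs` is a list of ((pred_lat, pred_lon), (true_lat, true_lon)),
    where the second element comes from ground truth data.
    The coverage level is an assumption made for this sketch.
    """
    errors = sorted(haversine_km(*pred, *truth) for pred, truth in pairs)
    if not errors:
        raise ValueError("need at least one ground truth pair")
    # Nearest-rank percentile: smallest error that covers the requested share.
    rank = max(1, math.ceil(coverage * len(errors)))
    return errors[rank - 1]
```

With this sketch, a set of predictions whose errors are mostly small yields a small radius, while a few large misses only matter once they fall inside the chosen coverage level.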

If you are interested in learning more about statistical confidence calculation, check out the Confidence interval article on Wikipedia.

What is the ground truth dataset?

Ground truth data consists of IP addresses with verifiable geolocation data. This geolocation data is derived from a number of different sources, which can include:

Crowdsourced GPS-backed or HTML5 geolocation data
Infrastructure routers with location data
Geofeed data collected through RFC 9092 or voluntary submissions
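As an illustration of the geofeed source, geofeed files discovered through RFC 9092 use the CSV layout defined in RFC 8805: prefix, country code, region, city, and postal code. The sketch below parses such a file with the Python standard library. The `GeofeedEntry` type and `parse_geofeed` function are hypothetical names for this example, and silently skipping malformed rows is an assumption, not a description of our pipeline.

```python
import csv
import io
import ipaddress
from typing import List, NamedTuple, Union


class GeofeedEntry(NamedTuple):
    network: Union[ipaddress.IPv4Network, ipaddress.IPv6Network]
    country: str
    region: str
    city: str


def parse_geofeed(text: str) -> List[GeofeedEntry]:
    """Parse RFC 8805-style geofeed CSV text into entries."""
    entries = []
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines and comment lines starting with '#'.
        if not row or row[0].lstrip().startswith("#"):
            continue
        # Pad so rows with missing trailing fields do not raise IndexError.
        fields = [f.strip() for f in row] + [""] * 4
        try:
            # strict=False masks any host bits instead of rejecting the prefix.
            network = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            continue  # assumption: ignore rows with an unparsable prefix
        entries.append(
            GeofeedEntry(network, fields[1].upper(), fields[2], fields[3])
        )
    return entries
```

Each parsed entry ties a network prefix to a location declared by the network operator, which is what makes geofeed data usable as a verification source.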

How do we crowd-source ground truth information?

Most of the ground truth data is submitted voluntarily by users, so this dataset is primarily crowdsourced. For example, if you visit one of our IP data pages, such as 8.8.8.8 IP Address Details - IPinfo.io, you will sometimes be prompted with a pop-up that asks you to share some data with us.

Kind users who grant HTML5-based geolocation access help us keep our geolocation dataset accurate.

Do we include this ground truth data in our IP geolocation data?

No.

We merely use it to determine the accuracy of our geolocation data. In statistics and data science, this is called statistical model validation. Even though we know the ground truth data is accurate, we don’t include it in our geolocation data product. Our IP to Geolocation data is derived from data backed by our Probe Network.
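A validation check of this kind can be sketched as follows. The example assumes you already have, for each ground truth IP address, the distance between the predicted and the true location plus the reported radius. The function name and input shape are hypothetical. Note that the ground truth only scores the predictions and is never written back into them.

```python
def radius_coverage(samples):
    """Share of ground truth points that fall inside the reported radius.

    `samples` is an iterable of (error_km, radius_km) tuples, where
    error_km is the distance between predicted and true location.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    hits = sum(1 for error_km, radius_km in samples if error_km <= radius_km)
    return hits / len(samples)
```

A coverage value close to the intended confidence level suggests the radius values are well calibrated, while a much lower value suggests they are too optimistic.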

If you are interested in our IP to Geolocation extended database, please reach out to our sales team.
