IPinfo API vs Google BigQuery

Hi, I use IPinfo to get a user's country (one day I may be able to afford more data like city, but that is still too expensive for a startup). I've had a look at Google's BigQuery and got sample code working. The difference in lookup time is:

ipinfo API: 0.959 seconds

Google: 2.297 seconds

Both return the same limited data, but Google takes much longer. Too long to be viable for a website that will hopefully become very busy.

Hi,

Thank you for using IPinfo Lite on GCP. Do not worry about upgrading to our IP to location data right now. The exact same data you are using is currently used by global software applications, major OSS projects, and Fortune 500 companies.

You have asked a great question!

You are comparing our API service with our GCP BigQuery integration for a website application. The short answer is: use the MMDB database.

In reality, when lookup time is critical, you should use the MMDB database. Lookups against a local MMDB file are near-instantaneous: faster than the API, and of course faster than GCP.

So, why is MMDB faster than the API and GCP?

MMDB is a binary database format designed specifically for IP address data. In the backend of our API service, we actually use an MMDB database as well.

In a traditional plain-text IP address database, each row of data represents an IP address range, like so:

| Field Name | Example | Data Type | Description |
|---|---|---|---|
| start_ip | 1.0.16.0 | TEXT | Starting IP address of an IP address range |
| end_ip | 1.0.31.255 | TEXT | Ending IP address of an IP address range |
| country | JP | TEXT | ISO 3166 country code of the location |

This means that when you look up an IP address, the algorithm goes through one row at a time, checking whether the IP address is greater than or equal to the starting IP address and less than or equal to the ending IP address. If it matches, it returns the corresponding IP address metadata.
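To make that concrete, here is a minimal sketch of that linear-scan lookup using Python's standard `ipaddress` module. The sample rows are made up for illustration:

```python
# Naive plain-text style lookup: scan every row until the range matches.
import ipaddress

# Hypothetical (start_ip, end_ip, country) rows.
ROWS = [
    ("1.0.0.0", "1.0.15.255", "AU"),
    ("1.0.16.0", "1.0.31.255", "JP"),
    ("1.0.32.0", "1.0.63.255", "CN"),
]

def linear_lookup(ip: str):
    """Return the country for an IP by scanning rows one at a time, or None."""
    n = int(ipaddress.ip_address(ip))
    for start, end, country in ROWS:
        if int(ipaddress.ip_address(start)) <= n <= int(ipaddress.ip_address(end)):
            return country
    return None

print(linear_lookup("1.0.20.5"))  # prints JP
```

With millions of rows, every lookup walks the table from the top, which is why this approach does not scale.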

But the binary MMDB format uses a more sophisticated structure (a binary search tree built over the bits of the IP address), which makes this process dramatically faster.

However, the issue with non-MMDB IP databases is that data platforms (such as Snowflake, GCP, PostgreSQL, etc.) are often not equipped with MMDB-style lookup mechanisms. So, if you look under the hood of our UDF, the implementation (though sophisticated and taking full advantage of what the platform provides) is never going to be as fast as the MMDB database or the API service.

So, the question becomes: considering there is a faster MMDB-based solution out there, why should you use GCP, Snowflake, or any of our other data platform solutions?

The answer is bulk lookups, ease of use, and data locality. All your data in GCP can be moved to BigQuery. If you wanted to use an MMDB-based solution instead, you would have to spin up a compute instance or similar, enrich the IP addresses there first, and then move the data to BigQuery.

Even though BigQuery's lookup mechanism is comparatively slow, BigQuery is essentially where all of GCP's data lives. So, once you bring your log data in, you can do threat intelligence and IP enrichment in bulk quite easily. Most of our users on the GCP or Snowflake integration are not doing real-time enrichment or small batches of IP enrichment; they are processing millions of IPs in a batch. At those scales, GCP is far faster than our API and is also not subject to API rate limits.

So, if you want to do real-time enrichment for a website, pick a library from our List of MMDB reader libraries and integrate the IPinfo Lite MMDB database into your service. This does mean you would have to keep the IPinfo Lite database regularly updated, but that is not a significant challenge.
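For example, with the `maxminddb` Python reader (one of the libraries on that list), a lookup might look like the sketch below. The filename `ipinfo_lite.mmdb` and the `country_code` field name are assumptions for illustration, so check them against the database you actually download:

```python
# Hedged sketch: real-time country lookup with the maxminddb reader library.
# "ipinfo_lite.mmdb" and the "country_code" field are illustrative
# assumptions; verify both against your downloaded IPinfo Lite database.
import os

def lookup_country(reader, ip: str):
    """Return the country code for an IP, or None when there is no match."""
    record = reader.get(ip)  # maxminddb readers expose get(ip) -> dict | None
    return record.get("country_code") if record else None

if os.path.exists("ipinfo_lite.mmdb"):
    import maxminddb  # pip install maxminddb
    with maxminddb.open_database("ipinfo_lite.mmdb") as reader:
        print(lookup_country(reader, "1.0.16.1"))
```

Opening the database once at service startup and reusing the reader for every request is what gives you the near-instant per-lookup times mentioned above.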


Please let me know what you think. Cheers!

— Abdullah | DevRel, IPinfo

Connect with me: https://www.linkedin.com/in/reincoder/