PostgreSQL: IP data lookup function

Our Snowflake listings package UDTF solutions facilitate an extremely fast IP lookup mechanism. These UDTFs consider the native functionalities offered by the Snowflake Database Platform.

It is possible to create similar UDTF solutions inside PostgreSQL that take advantage of PostgreSQL’s native functionalities. We will explore how to create these IP lookup UDF functions inside PostgreSQL.

Step 0: Go through the IPinfo-Postgres documentation

Before we begin, we highly recommend you review the PostgreSQL documentation we have prepared.

:link: Using IPinfo’s data downloads in PostgreSQL - Docs / Database Downloads - IPinfo Community

You are required to:

  • Install the IP4R extension
  • Creating the IP data table
  • Ingesting the database in IP data table
  • Creating the index
  • Understand the IP lookup query

All of these steps are covered in the documentation article.

The essential aspect to understand is the IP lookup query, which looks something like this:

SELECT
	l.ip,
	ipc.country,
	ipc.country_name,
	ipc.continent,
	ipc.continent_name
FROM log l
JOIN ip_country ipc
ON ipc.ip_range && l.ip::ipaddress; -- casting ip as ipaddress data type

Essentially, we will just convert the above query to a function.

Step 1: Create the lookup function.

Based on the IP lookup query, we will create the function using PL/pgSQL — SQL Procedural Language. PL/pgSQL is a procedural language supported by Postgres that extends the capabilities of standard SQL. This process includes creating functions.

For this example, I am using the free IP to Country database.

The function declaration statement looks like the following:

CREATE OR REPLACE FUNCTION ip_country(log_table TEXT, ip TEXT)
  RETURNS TABLE (input_ip TEXT, country TEXT, country_name TEXT, continent TEXT, continent_name TEXT)  AS $$
BEGIN
  RETURN QUERY EXECUTE '
  	SELECT l.' || ip || ', ipc.country, ipc.country_name, ipc.continent, ipc.continent_name
	FROM ' || log_table || ' l
	JOIN ip_country ipc
	ON ipc.ip_range && l.' || ip || '::ipaddress; -- casting ip as ipaddress data type
  ';
END;
$$ LANGUAGE plpgsql;

Query breakdown:

  • The function ip_country is created (or replaced) that takes the parameters log_table and ip as text inputs.
    • Please note, these two parameters are used not as variables but as text references to the IP tables.
    • This will point to the variable names you have set for your tables. log_table refers to the table that contains log data and ip is the column of the log_table referenced table containing IP addresses.
    • When using the function, you will see that we use strings and do not pass variables to the functions. The strings are references for the names of the tables inside your database.
  • RETURNS TABLE returns the table containing the IP addresses you passed, along with the IP metadata information available in the IPinfo database you refer to. Since we are utilizing the IP to Country database, the IP metadata information available in the database includes the following:
    • country
    • country_name
    • continent
    • continent_name
  • In the next section, we declare the function using PL/pgSQL. Essentially it templatizes the IP lookup query.

As this function takes references to lookup tables and columns as parameters, you only need to run the query once. You do not need to update this query whenever you ingest our data in Postgres.

Step 2: Using the function

Running the query is extremely simple:

SELECT *
FROM ip_country('log', 'ip');

Query breakdown:

  • 'log' → The table containing IP addresses you want to enrich. It is important to remember that this points to a table, CTE, or a view that already exists. This is why we use the single quotes (').
  • 'ip' → The column containing IP addresses. Once again, it is crucial to remember that this refers to a column of the referenced (not passed) table, CTE, or view from the previous step. To indicate that this is a reference to an existing entity, we use the single quotes (').

Output

The differences between the Snowflake UDTF and the PostgreSQL UDF are as follows:

  • We package the Snowflake UDTF ourselves, so you do not need to create the query.
  • There is no join declaration when using the query.
  • The Snowflake UDTF takes variables as function parameters, whereas in PG UDF, we pass text variables that refer to the actual variable inside PostgreSQL.

If you have any questions/feedback, please feel free to drop them below in the comment section.

should be
[postgrsql workflow]

Yes, I’m a huge contributor. :wink:

1 Like

(Without sarcasm) yes you are! Thank you very much. If you see ANY typos let me know, that will be greatly appreciated! Thanks!

1 Like