Our Snowflake listings package UDTF solutions facilitate an extremely fast IP lookup mechanism. These UDTFs consider the native functionalities offered by the Snowflake Database Platform.
It is possible to create similar UDTF solutions inside PostgreSQL that take advantage of PostgreSQL’s native functionalities. We will explore how to create these IP lookup UDF functions inside PostgreSQL.
Step 0: Go through the IPinfo-Postgres documentation
Before we begin, we highly recommend you review the PostgreSQL documentation we have prepared.
Using IPinfo’s data downloads in PostgreSQL - Docs / Database Downloads - IPinfo Community
You are required to:
- Install the IP4R extension
- Creating the IP data table
- Ingesting the database in IP data table
- Creating the index
- Understand the IP lookup query
All of these steps are covered in the documentation article.
The essential aspect to understand is the IP lookup query, which looks something like this:
SELECT
l.ip,
ipc.country,
ipc.country_name,
ipc.continent,
ipc.continent_name
FROM log l
JOIN ip_country ipc
ON ipc.ip_range && l.ip::ipaddress; -- casting ip as ipaddress data type
Essentially, we will just convert the above query to a function.
Step 1: Create the lookup function.
Based on the IP lookup query, we will create the function using PL/pgSQL — SQL Procedural Language. PL/pgSQL is a procedural language supported by Postgres that extends the capabilities of standard SQL. This process includes creating functions.
For this example, I am using the free IP to Country database.
The function declaration statement looks like the following:
CREATE OR REPLACE FUNCTION ip_country(log_table TEXT, ip TEXT)
RETURNS TABLE (input_ip TEXT, country TEXT, country_name TEXT, continent TEXT, continent_name TEXT) AS $$
BEGIN
RETURN QUERY EXECUTE '
SELECT l.' || ip || ', ipc.country, ipc.country_name, ipc.continent, ipc.continent_name
FROM ' || log_table || ' l
JOIN ip_country ipc
ON ipc.ip_range && l.' || ip || '::ipaddress; -- casting ip as ipaddress data type
';
END;
$$ LANGUAGE plpgsql;
Query breakdown:
- The function
ip_country
is created (or replaced) that takes the parameterslog_table
andip
as text inputs.- Please note, these two parameters are used not as variables but as text references to the IP tables.
- This will point to the variable names you have set for your tables.
log_table
refers to the table that contains log data andip
is the column of thelog_table
referenced table containing IP addresses. - When using the function, you will see that we use strings and do not pass variables to the functions. The strings are references for the names of the tables inside your database.
RETURNS TABLE
returns the table containing the IP addresses you passed, along with the IP metadata information available in the IPinfo database you refer to. Since we are utilizing the IP to Country database, the IP metadata information available in the database includes the following:country
country_name
continent
continent_name
- In the next section, we declare the function using PL/pgSQL. Essentially it templatizes the IP lookup query.
- One thing to note is
JOIN ip_country ipc
. When creating the IP data table at the beginning, we named the IP to Country database table asip_country
. Now, if you are using any other IP data table, you can name it accordingly (IP to Geolocation database →ip_location
, ASN database →ip_asn
, IP to Privacy Detection Database →ip_privacy
etc.).
- One thing to note is
As this function takes references to lookup tables and columns as parameters, you only need to run the query once. You do not need to update this query whenever you ingest our data in Postgres.
Step 2: Using the function
Running the query is extremely simple:
SELECT *
FROM ip_country('log', 'ip');
Query breakdown:
'log'
→ The table containing IP addresses you want to enrich. It is important to remember that this points to a table, CTE, or a view that already exists. This is why we use the single quotes ('
).'ip'
→ The column containing IP addresses. Once again, it is crucial to remember that this refers to a column of the referenced (not passed) table, CTE, or view from the previous step. To indicate that this is a reference to an existing entity, we use the single quotes ('
).
Output
The differences between the Snowflake UDTF and the PostgreSQL UDF are as follows:
- We package the Snowflake UDTF ourselves, so you do not need to create the query.
- There is no
join
declaration when using the query. - The Snowflake UDTF takes variables as function parameters, whereas in PG UDF, we pass text variables that refer to the actual variable inside PostgreSQL.
If you have any questions/feedback, please feel free to drop them below in the comment section.