Using our database in PostgreSQL involves the following steps:
- Using the IP4R extension for IP address lookups.
- Create a sample
log
table. - Create the table for the IPinfo data ingestion and ingest the data there.
- Running the IP query operation.
Step 1 - Using the IP4R extension
To get started, you need to use the IP4R extension.
CREATE EXTENSION ip4r;
PostgreSQL: Using and installing the IP4R extension
PostgreSQL: Why use the IP4R extension and not Postgres’ network address types?
Step 2 - Create a sample log
table
We will create a sample log table for the demo and populate it with random IPv4 addresses. The data type of the IP
column is set as text
.
The
log
table consists of random IP addresses. Some of those IP addresses are bogon and will not have location data attached to them.
On psql
, we are running:
CREATE TABLE log (
id bigserial primary key,
ip text
);
To populate the log
table with 100000 random IP addresses, run the following:
INSERT INTO log(ip)
SELECT CONCAT(
TRUNC(RANDOM() * 250 + 2),'.',
TRUNC(RANDOM() * 250 + 2),'.',
TRUNC(RANDOM() * 250 + 2),'.',
TRUNC(RANDOM() * 250 + 2)
)::text
FROM GENERATE_SERIES(0, 99999);
Step 3 - Create the table for the IPinfo data
We declare the IP address columns (start_ip
, end_ip
and join_key
) as ipaddress
data type that comes from the IP4R extension. We generate an ip_range
column as iprange
format from the start_ip
and end_ip
columns.
I am using the IP geolocation database as an example for this guide.
-- Create the table for the IP to Geolocation data
CREATE TABLE ip_location (
start_ip ipaddress,
end_ip ipaddress,
join_key ipaddress,
ip_range iprange generated always as (iprange(start_ip, end_ip)) stored,
city text,
region text,
country text,
latitude text,
longitude text,
postal_code text,
timezone text
);
The database schema for the data downloads is available here: Database Types - IPinfo.io
If you are using the extended databases, you can find the schema for them here: GitHub - ipinfo/sample-database: IPinfo database and API sample repository
You can check out this post where we have CREATE TABLE
queries for all our databases: Postgres `CREATE TABLE` queries for IPinfo data downloads
Step 4 - Ingesting the data download in the Postgres table
You can use the following bash command to:
- Download IP Geolocation database in the zipped CSV format
- Unzip to get the CSV data
- Ingest the CSV data into the
IP_LOCATION
table via PSQL
Please set the PostgreSQL connection string here:
postgresql://username:password@database:5432/postgres
Please set up the IPinfo token here:
<YOUR_TOKEN>
curl -sL https://ipinfo.io/data/location.csv.gz?token=<YOUR_TOKEN> 2>&1 | gzip -d | psql postgresql://username:password@database:5432/postgres -c "COPY ip_location FROM STDIN DELIMITER ',' CSV HEADER"
Step 5 - Running the IP query operation
We can create the index for the IP_LOCATION
table. Then we can update the log
table with the location data. On our lookup operation, we are casting the data types of IP addresses in the log as ipaddress
data type from the IP4R
extension.
-- Setting up the index for the IP_LOCATION table
-- This might take a while
CREATE INDEX ipl_ip_range
ON ip_location
USING gist (ip_range);
-- Create a joined table from the IP addresses from the log table
-- with the IP to Location insights from the IP_LOCATION table
SELECT *
FROM log l
JOIN ip_location ipl
ON ipl.ip_range && l.ip::ipaddress; -- casting ip as ipaddress data type
This will output a joined data table containing the log’s IP addresses and their corresponding IP location data from the IP to Location database.
If for some reason the query operation is taking more than a few seconds, try running the command:
SET enable_seqscan = OFF;
Postgres doc
Optional Step - Outputting only the relevant IP data columns from the IP query operation
To clean up the output table a bit, call the individual data columns instead of running the SELECT
statement with *
.
SELECT
l.ip,
ipl.city,
ipl.region,
ipl.country,
ipl.latitude,
ipl.longitude,
ipl.postal_code,
ipl.timezone
FROM log l
JOIN ip_location ipl
ON ipl.ip_range && l.ip::ipaddress;