IP addresses that visit non-existent URL paths: A Look into Tags and Our Privacy Data

Daniel Cuthbert is working on an AI SOC project. The project identifies patterns that are somewhat unique to malicious traffic. The project uses IPinfo’s data, which intrigued me.

I shared this idea in a comment there:

Another signal worth considering is failed requests to non-existent or unexpected URLs (404/405/403 patterns). In practice, attackers often perform URL dictionary traversal or endpoint enumeration to discover unprotected routes, legacy admin paths, or misconfigured APIs.

Now, a little bit of context. I am familiar with API reverse engineering, and probing for non-existent paths is about as fundamental as reconnaissance techniques get.

I have a small website hosted using NGINX, and I thought I would investigate the IP addresses responsible for 404 errors.

awk '$9 == 404 {print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr 
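The comment above also mentions 403 and 405 responses, so here is a minimal sketch that widens the same count to those status codes. It assumes the default NGINX "combined" log format (field 1 is the client address, field 9 is the status code); the function name `failed_requests_by_ip` is made up for illustration.

```shell
# Count 403/404/405 responses per client IP, busiest IP first.
# Assumption: default NGINX "combined" log format, so $1 is the client
# address and $9 is the HTTP status code.
failed_requests_by_ip() {
    awk '$9 == 403 || $9 == 404 || $9 == 405 {print $1}' "$1" \
        | sort | uniq -c | sort -nr
}
```

Call it as `failed_requests_by_ip /var/log/nginx/access.log`. If your `log_format` is customized, the field numbers will likely differ.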

:link: IP Summarization Results of 103 IPs | IPinfo.io

You can summarize IP addresses using our data here: IP Summarization & Data Visualization | IPinfo.io

Of the 103 IP addresses, only about 13 are tagged as non-anonymous (based on our privacy detection and residential proxy data).

Let’s take a look at the non-anonymous IP addresses.

Portscan Tag

Introducing the Portscan Tag

I found a Censys IP address there that we do not label as anonymous. We know that these IP addresses perform port scans with non-malicious intent, and this one is not an abusive IP address according to our partner AbuseIPDB.

:link: 167.94.146.54 | Frankfurt am Main, AS398705, & VPN Not Detected - IPinfo.io

The Shadow Server Project runs a global honeypot and conducts innocuous port scans. Unfortunately, this IP address is still tagged as abusive.

:link: 74.82.47.3 | San Jose, AS6939, & VPN Not Detected - IPinfo.io

Crawler Tag

I also found a few Apple IP addresses there. Although they are not hosting-type addresses, we do tag them as crawlers.

:link: 17.241.75.230 | Seattle, AS714, & VPN Not Detected - IPinfo.io

The same case applies to several Facebook IP addresses as well.

:link: 69.63.184.9 | Social Circle, AS32934, & VPN Not Detected - IPinfo.io

No Portscan/Crawler tags and malicious

These patterns are extremely suspicious and actually verge on being malicious. The website I am hosting does not use PHP. You can identify malicious traffic like this by its URL patterns.

:link: 91.232.238.112 | Vradiyivka, AS198253, & VPN Not Detected - IPinfo.io

91.232.238.112 /admin/config.php

Although we identify a BitTorrent client, our honeypot data does not indicate any other intrusion patterns.

:link: 150.228.3.22 | Zagreb, AS14593, & VPN Not Detected - IPinfo.io

150.228.3.22 /xmlrpc.php
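Requests like the two above can be pulled out of the log with a filter along these lines. This is only a sketch: it assumes the default NGINX "combined" log format (field 7 is the request path, field 9 is the status code) and a site that serves no PHP at all, and `php_probes` is a name I made up.

```shell
# List unique (client IP, path) pairs for 404 responses where the path
# ends in .php, optionally followed by a query string.
# Assumption: default "combined" log format and a site with no PHP routes.
php_probes() {
    awk '$9 == 404 && $7 ~ /\.php(\?|$)/ {print $1, $7}' "$1" | sort -u
}
```

Running `php_probes /var/log/nginx/access.log` prints one `IP path` pair per line. On a site that does use PHP, this filter would also catch ordinary broken links, so it is only a useful signal when PHP paths should never be requested.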

No tags and non-malicious

I think these are related to server-side issues and are not suspicious traffic.

$ awk '{print $1, $7}' /var/log/nginx/access.log

125.20.185.10 /
125.20.185.10 /
154.70.82.114 /
154.70.82.114 /

:link: 125.20.185.10 | Jandiāla Gurū, AS9498, & VPN Not Detected - IPinfo.io

:link: 154.70.82.114 | Lomé, AS30982, & VPN Not Detected - IPinfo.io


Fun experiment, but a simple 404 hit does not indicate malicious intent. Nothing super conclusive. I think if you map the URL patterns, that could be interesting :thinking:
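As a rough starting point for mapping URL patterns, something like the sketch below could work. It assumes the default NGINX "combined" log format (field 1 client address, field 7 path, field 9 status), and both function names are hypothetical. The idea is that an IP hitting many distinct missing paths looks more like dictionary enumeration than one that hit a single broken link.

```shell
# Rank 404 paths by how often they were requested, most requested first.
# Assumption: default "combined" log format ($7 = path, $9 = status).
top_404_paths() {
    awk '$9 == 404 {print $7}' "$1" | sort | uniq -c | sort -nr
}

# For each client IP, count how many distinct paths returned 404.
# Repeated hits on the same path by the same IP are counted once.
distinct_404_paths_by_ip() {
    awk '$9 == 404 && !seen[$1 SUBSEP $7]++ {n[$1]++}
         END {for (ip in n) print n[ip], ip}' "$1" | sort -nr
}
```

Both take the log path as their only argument, for example `distinct_404_paths_by_ip /var/log/nginx/access.log`. What counts as "many" distinct paths is a judgment call and would need tuning per site.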

If you just buy our privacy data, you can catch 76% of the suspicious traffic right out of the gate. By incorporating tags, data, thorough SOC analysis, and threat intelligence, you can effectively block the majority of suspicious traffic.

Looks fun