Identify and prevent bots from accessing your site with IPinfo. A closer look at the hosting type in IP to Privacy Detection data.
In our post, Getting IP data from anonymous IP addresses, we took a closer look at getting ASN and company data from anonymous IP addresses.
If you are familiar with our IP to Privacy API or data download services, you know that we identify the following anonymous IP address types:
- VPN
- Proxy
- Tor
- Relay
- Hosting
In this post, we take a closer look at the hosting category, which covers IP addresses hosted on cloud platforms, in data centers, and on other hosting services.
What do we mean by a hosting IP address? Is that a bot?
Usually, yes. Bots are typically hosted on platforms such as data centers and cloud providers. There are different types of bots, but they are generally defined as programs that access internet-connected services, such as servers and websites, and repeatedly perform preprogrammed actions.
Bots serve different purposes, but they generally run on data center servers. Consumer devices on regular ISP connections cannot reliably support bots that perform large-scale automated activity, and ISPs usually allocate very limited bandwidth per consumer IP address. To be effective, a bot therefore needs to be hosted on a server.
So, by identifying hosting services, we can reasonably identify bots.
What kind of information can we get from hosting IP addresses?
In our IP to Privacy Data, we identify hosting services with a boolean flag.
Take 64.233.160.0 as an example. This IP address belongs to GCP (Google Cloud Platform, AS15169), so it can potentially host a bot service.
Code
curl -s "https://ipinfo.io/64.233.160.0?token=$token" | jq .privacy
Result
{
  "vpn": false,
  "proxy": false,
  "tor": false,
  "relay": false,
  "hosting": true,
  "service": ""
}
The response correctly identifies the IP address as a hosting service, which means it could potentially be running a bot. To get more information, we can also pull the company and ASN data.
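As a minimal sketch of how the flag might be consumed in application code, the snippet below reads the `hosting` boolean from an already-parsed response. The function name `is_hosting` and the `sample` dictionary are illustrative assumptions; the sample simply mirrors the JSON shown above and is not fetched from the API.

```python
def is_hosting(response: dict) -> bool:
    """Return True only when the privacy object explicitly sets hosting to true."""
    # A missing or null "privacy" object is treated as "not hosting" (an assumption).
    privacy = response.get("privacy") or {}
    return privacy.get("hosting") is True


# Hypothetical sample that mirrors the privacy object shown in the post.
sample = {
    "privacy": {
        "vpn": False,
        "proxy": False,
        "tor": False,
        "relay": False,
        "hosting": True,
        "service": "",
    }
}

if __name__ == "__main__":
    print(is_hosting(sample))
```

Treating a missing `privacy` object as "not hosting" is a design choice for this sketch; a stricter caller might prefer to raise an error instead.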
curl -s "https://ipinfo.io/64.233.160.0?token=$token" | jq '{ "privacy": .privacy, "company": .company, "asn": .asn }'
{
  "privacy": {
    "vpn": false,
    "proxy": false,
    "tor": false,
    "relay": false,
    "hosting": true,
    "service": ""
  },
  "company": {
    "name": "Google LLC",
    "domain": "google.com",
    "type": "hosting"
  },
  "asn": {
    "asn": "AS15169",
    "name": "Google LLC",
    "domain": "google.com",
    "route": "64.233.160.0/24",
    "type": "hosting"
  }
}
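For logging or later analysis, the three objects in this response can be flattened into one record. The following is a rough sketch under stated assumptions: the `HostingSummary` dataclass and the `summarize` helper are hypothetical names, and the `sample` dictionary copies the fields from the output above rather than calling the API.

```python
from dataclasses import dataclass


@dataclass
class HostingSummary:
    hosting: bool
    company_name: str
    asn: str
    route: str


def summarize(response: dict) -> HostingSummary:
    """Flatten the privacy, company, and asn objects into one record."""
    privacy = response.get("privacy") or {}
    company = response.get("company") or {}
    asn = response.get("asn") or {}
    return HostingSummary(
        hosting=privacy.get("hosting") is True,
        company_name=company.get("name", ""),
        asn=asn.get("asn", ""),
        route=asn.get("route", ""),
    )


# Hypothetical sample copied from the response for 64.233.160.0 shown above.
sample = {
    "privacy": {"vpn": False, "proxy": False, "tor": False,
                "relay": False, "hosting": True, "service": ""},
    "company": {"name": "Google LLC", "domain": "google.com", "type": "hosting"},
    "asn": {"asn": "AS15169", "name": "Google LLC", "domain": "google.com",
            "route": "64.233.160.0/24", "type": "hosting"},
}

if __name__ == "__main__":
    print(summarize(sample))
```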
Which database best complements our IP to Privacy Detection database?
The organization behind an AS owns its IP addresses, but it can either operate them itself or let other companies use them. ASN data alone therefore does not tell you who is actually operating an address, which is why the IP to Company database is the best complement to the IP to Privacy Detection database.
For example: 128.177.109.0
curl -s "https://ipinfo.io/128.177.109.0?token=$token" | jq '{ "privacy": .privacy, "company": .company, "asn": .asn }'
{
  "privacy": {
    "vpn": false,
    "proxy": false,
    "tor": false,
    "relay": false,
    "hosting": true,
    "service": ""
  },
  "company": {
    "name": "Google Inc.",
    "domain": "google.com",
    "type": "hosting"
  },
  "asn": {
    "asn": "AS6461",
    "name": "Zayo Bandwidth",
    "domain": "zayo.com",
    "route": "128.177.0.0/16",
    "type": "hosting"
  }
}
Google is operating this IP address. However, the IP address is owned by Zayo Bandwidth – AS6461.
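One way to surface this owner/operator split programmatically is to compare the `domain` fields of the `company` and `asn` objects. This is only a heuristic sketch, not IPinfo's own method: the function name `operator_differs_from_owner` is hypothetical, and the sample dictionaries contain just the two domain fields from the responses shown in this post.

```python
def operator_differs_from_owner(response: dict) -> bool:
    """Heuristic: True when the company domain and the ASN domain are both
    present and do not match (case-insensitive comparison)."""
    company_domain = ((response.get("company") or {}).get("domain") or "").lower()
    asn_domain = ((response.get("asn") or {}).get("domain") or "").lower()
    if not company_domain or not asn_domain:
        # Without both values there is nothing to compare (an assumption).
        return False
    return company_domain != asn_domain


# Hypothetical samples using the domains from the two responses in this post.
zayo_sample = {
    "company": {"domain": "google.com"},
    "asn": {"domain": "zayo.com"},
}
google_sample = {
    "company": {"domain": "google.com"},
    "asn": {"domain": "google.com"},
}

if __name__ == "__main__":
    print(operator_differs_from_owner(zayo_sample))
    print(operator_differs_from_owner(google_sample))
```

A mismatch here only signals that the two datasets name different organizations; whether that matters for a given policy is left to the caller.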
Conclusion
Using this approach, you can easily identify bots hosted on cloud platforms such as AWS and GCP, as well as on hosting services such as Hetzner and OVH. You can then create firewall policies or reputation scores in your log management software based on the IP to Privacy data to keep bots away from your site and server.
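A reputation score of this kind could be sketched as a weighted sum over the privacy flags. Everything numeric here is an assumption made for illustration: the `WEIGHTS` table, the `block_at` threshold, and the function names `reputation_score` and `decide` are hypothetical and should be tuned to your own traffic.

```python
# Hypothetical weights per privacy flag; these are not IPinfo recommendations.
WEIGHTS = {"tor": 50, "vpn": 30, "proxy": 30, "relay": 10, "hosting": 40}


def reputation_score(privacy: dict) -> int:
    """Sum the weights of every flag that is explicitly true, capped at 100."""
    total = sum(weight for flag, weight in WEIGHTS.items()
                if privacy.get(flag) is True)
    return min(100, total)


def decide(privacy: dict, block_at: int = 40) -> str:
    """Return "block" when the score reaches the (assumed) threshold."""
    return "block" if reputation_score(privacy) >= block_at else "allow"


# Hypothetical privacy objects: one hosting IP and one with no flags set.
hosting_only = {"vpn": False, "proxy": False, "tor": False,
                "relay": False, "hosting": True, "service": ""}
residential = {"vpn": False, "proxy": False, "tor": False,
               "relay": False, "hosting": False, "service": ""}

if __name__ == "__main__":
    print(reputation_score(hosting_only), decide(hosting_only))
    print(reputation_score(residential), decide(residential))
```

With these assumed weights a hosting-only address lands exactly on the threshold. Legitimate services also run on hosting IP addresses, so a real policy would likely pair the score with an allowlist.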
IPinfo services referenced:
- API
- Database