How to find IP owner?

  • Thread starter: Guest

I have a C# Web Application, and we are getting banged repeatedly by a web
crawler at a specific IP address. I did an internet lookup on the IP address
(http://www.ip2location.com), and it just says it is owned by Cox
Communications. I don't know if it belongs to a specific customer of Cox, or
if it belongs to Cox themselves. What is the normal procedure for finding
the owner of the IP? Do I call Cox and complain about abuse? Or is there a
better tool for finding out the actual customer that is running on the IP so
that I may contact them directly and find out why they are banging our site
so often? I know I can just block the IP address, but right now, I can't be
sure if it's one specific user on a dedicated IP, or if it's one person on a
shared network IP.
 
Brian said:
I have a C# Web Application, and we are getting banged repeatedly by a web
crawler at a specific IP address. I did an internet lookup on the IP address
(http://www.ip2location.com), and it just says it is owned by Cox
Communications. I don't know if it belongs to a specific customer of Cox, or
if it belongs to Cox themselves. What is the normal procedure for finding
the owner of the IP? Do I call Cox and complain about abuse? Or is there a
better tool for finding out the actual customer that is running on the IP so
that I may contact them directly and find out why they are banging our site
so often? I know I can just block the IP address, but right now, I can't be
sure if it's one specific user on a dedicated IP, or if it's one person on a
shared network IP.

It is probably a cable modem user.

Collect the IP address along with the dates and times of the abuse, block
that IP from your website, and contact the Cox abuse center.

http://www.cox.com/support/selectlocation_contact.asp
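
For the "collect the IP, dates and times" step, here is a minimal sketch of pulling the evidence out of a web server log. It assumes a W3C-style log with a "#Fields:" header line, which is the sort of thing IIS can write; the sample lines, the address 203.0.113.7 and the function name collect_hits are made up for illustration, so adjust them to whatever your logs actually contain.

```python
from datetime import datetime

# Hypothetical sample in W3C extended log style; real logs will differ.
SAMPLE_LOG = [
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2005-03-01 10:00:01 203.0.113.7 GET /search.aspx 200",
    "2005-03-01 10:00:02 198.51.100.20 GET /default.aspx 200",
    "2005-03-01 10:00:03 203.0.113.7 GET /vendor.aspx 200",
]


def collect_hits(log_lines, ip):
    """Return (timestamp, url) pairs for every request made by `ip`."""
    fields = []
    hits = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns used by the lines that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#"):
            continue
        row = dict(zip(fields, line.split()))
        if row.get("c-ip") == ip:
            stamp = datetime.strptime(
                row["date"] + " " + row["time"], "%Y-%m-%d %H:%M:%S"
            )
            hits.append((stamp, row.get("cs-uri-stem", "")))
    return hits


if __name__ == "__main__":
    for stamp, url in collect_hits(SAMPLE_LOG, "203.0.113.7"):
        print(stamp.isoformat(), url)
```

A list of timestamps and URLs like this is the kind of thing an abuse desk would normally want attached to a complaint.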
 
Brian said:
I have a C# Web Application

That sounds like the only part of this that has to do with C#. You might
want to consider an IIS-related group for this type of query.

Brian said:
We are getting banged repeatedly by
a web crawler at a specific IP address.

How often?

Brian said:
Do I call Cox and complain about abuse?

If you must.

Brian said:
Or is there a better tool for finding out the actual
customer that is running on the IP

No.
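
What a lookup can give you is the network owner and, usually, an abuse contact, not the individual subscriber. Here is a rough sketch of a plain WHOIS query (TCP port 43) against whois.arin.net, plus a small parser for the reply. The field names "OrgName" and "OrgAbuseEmail" are what I would expect ARIN's output to contain, but treat them as an assumption; the sample response and the function names are invented for illustration.

```python
import socket

# Made-up response text, shaped like a typical WHOIS reply.
SAMPLE_RESPONSE = """\
NetRange:       203.0.113.0 - 203.0.113.255
OrgName:        Example ISP Inc.
OrgAbuseEmail:  abuse@example.net
"""


def whois_query(query, server="whois.arin.net", port=43, timeout=10):
    """Send one WHOIS query and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall((query + "\r\n").encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_fields(whois_text, wanted=("OrgName", "OrgAbuseEmail")):
    """Pick the first value of each wanted 'Key: value' line."""
    found = {}
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            found.setdefault(key.strip(), value.strip())
    return found


if __name__ == "__main__":
    # Parsing the canned sample; call whois_query("<ip>") for a live lookup.
    print(extract_fields(SAMPLE_RESPONSE))
```

For a residential address this will most likely just name the ISP again, which is why the complaint ends up going to their abuse desk.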

Brian said:
I know I can just block the IP address, but right now, I
can't be sure if it's one specific user on a dedicated IP,
or if it's one person on a shared network IP.

If I were you, the first question I'd ask myself would be "Is this a
problem?", where problem is defined as substantial increase in my bandwidth
use or server load, or perhaps a valid security concern.

Depending on what exactly is being transferred, what seems often at first may
actually amount to nothing significant at all. For example, requesting a
page once a minute seems often, but if the response happens to be 2 KB of
HTML (note that if this is a web crawler or another automated process, it
might not be loading images even if you have them), that amounts to 2 KB *
60 * 24 * 30 = ~86.4 MB / month, which is practically nothing.
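
The same back-of-the-envelope estimate as a small function, so you can plug in your own numbers. This is only a sketch: the function name and parameters are made up here, and it uses 1 MB = 1000 KB.

```python
def monthly_transfer_mb(response_kb, requests_per_hour, days=30):
    """Estimate monthly transfer in MB for a steady request rate."""
    total_kb = response_kb * requests_per_hour * 24 * days
    return total_kb / 1000  # 1 MB taken as 1000 KB


if __name__ == "__main__":
    # One 2 KB page per minute, as in the example above.
    print(monthly_transfer_mb(response_kb=2, requests_per_hour=60))
```

If the real response size or request rate is much higher, the same call will show whether it starts to matter.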

One valid reason for doing this comes to mind: does your web application
present any data somebody might want to scrape automatically for use
somewhere else? If yes, is this necessarily bad? Assume first it's not for
re-publishing (there are copyright laws for that).
 
The problem is that my site has a lot of links that do searches against
databases and vendor databases. Some of these searches we pay a fee for
(small, but a fee nonetheless). When these crawlers hit our site, they also
invoke every search link on our site. There are maybe 50 or 60 links that
invoke searches.
 
Brian Kitt said:
The problem is that my site has a lot of links that do searches against
databases and vendor databases. Some of these searches we pay a fee for
(small, but a fee nonetheless). When these crawlers hit our site, they also
invoke every search link on our site. There are maybe 50 or 60 links that
invoke searches.

A robots.txt file is definitely what you want then - just disable the
crawlers from following those links, and it should be fine.
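
As a sketch of what that could look like, assuming the search links all live under a path such as /search/ (a made-up path, substitute your own), the rules below disallow that path for every crawler. The Python part just uses the standard library's robots.txt parser to check that the rules do what is intended.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep all crawlers away from the search links.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "/search/vendor?q=abc"))
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "/about.html"))
```

The file itself has to be served as /robots.txt at the root of the site for crawlers to find it.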
 
A robots.txt file isn't necessarily going to solve the problem.
Crawlers from respectable companies will respect the robots.txt file, but
it's not those guys you have to worry about.
 
Brian,

Since you know the specific IP that the request is coming from, you can
block the IP address. Here is a link to an article which explains how to do
it:

http://www.15seconds.com/issue/011227.htm

If they truly want to crawl your site, they will end up contacting you
and making themselves known.
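
The article covers doing this in IIS. Purely to illustrate the logic, here is a minimal sketch of the check itself: compare the caller's address against a list of blocked addresses or ranges before doing any work. The addresses are documentation-only examples and the names are made up.

```python
import ipaddress

# Hypothetical block list; a /32 entry is a single address.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(remote_addr, blocked=BLOCKED_NETWORKS):
    """Return True if the caller's address falls in a blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in blocked)


if __name__ == "__main__":
    for candidate in ("203.0.113.7", "203.0.113.8", "198.51.100.42"):
        print(candidate, is_blocked(candidate))
```

Blocking a whole range is the blunt option if the address turns out to be dynamic, at the risk of locking out other users of the same ISP.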
 
Nicholas Paldino said:
A robots.txt file isn't necessarily going to solve the problem.
Crawlers from respectable companies will respect the robots.txt file, but
it's not those guys you have to worry about.

It depends - if the site is set up in a way which would make
respectable crawlers hit the "wrong" links, a robots.txt file *would*
sort it out. It's definitely the first thing to try. If that fails,
*then* it's worth going in for IP blocking etc.
 
Jon Skeet said:
It depends - if the site is set up in a way which would make
respectable crawlers hit the "wrong" links, a robots.txt file *would*
sort it out. It's definitely the first thing to try. If that fails,
*then* it's worth going in for IP blocking etc.

There is also the problem of dynamic pages that change content based on
the link clicked on. I have some pages I want googled and some I don't,
so I also check the browser string for things like "bot". One page is
used so infrequently that I send an email to myself if someone shows up
there, and generally it turns out to be a bot, which I add to the list
and redirect.
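
A rough sketch of that kind of browser-string check, for anyone who wants to try it. Only "bot" comes from my own setup; the other markers and the function name are assumptions added for illustration, and a crawler that lies about its user agent will get straight past this.

```python
# "bot" is the marker mentioned above; the rest are assumed extras.
BOT_MARKERS = ("bot", "crawler", "spider")


def looks_like_bot(user_agent):
    """Return True if the User-Agent string contains a known bot marker."""
    if not user_agent:
        return False  # a missing header is left for the caller to judge
    lowered = user_agent.lower()
    return any(marker in lowered for marker in BOT_MARKERS)


if __name__ == "__main__":
    agents = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    ]
    for agent in agents:
        print(looks_like_bot(agent), agent)
```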
 