Perl of makenh
Can you tell me whether there is a problem with this code from makenh:
    open(ROBOTFILE, $TEMPROBOTFILE); # assume it'll work
    while(<ROBOTFILE>){
        s/\#.*$//; # remove comments
        if(/^User-agent:.*\W$ROBOTNAME\W/io ||
           /^User-agent:\s*[*]/io){
            # check for paths
            print LOGFILE " Found reference to this robot in robot file\n";
            while(<ROBOTFILE>){
                if(/^Disallow:\s*(\S+)\s*(\#.*)?/){
                    print LOGFILE " Robot disallowed for $1\n";
                    push(@paths, $1);
                }else{
                    last; # we're done with the record
                }
            }
        }
    }
Our robots.txt file looks like this:
User-agent: GPOHTTPGET
Disallow:
User-agent: *
Disallow: /
We set ROBOTNAME to GPOHTTPGET. The code appears to match that name in the
file correctly and to read the first Disallow: line, which is empty and so
should disallow nothing for our robot. But it also appears to pick up the
second record (the one specified by the asterisk) and then disallow everything.
The way I read the standard, this robots.txt file should allow GPOHTTPGET to
index the whole site while excluding everybody else.
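That reading of the standard can be sketched in Perl as follows. This is only a minimal illustration, not makenh's code: the subroutine name disallowed_paths is hypothetical, and the case-insensitive substring match on the robot name is an assumption about how names should be compared. A record that names the robot takes precedence over the "*" record, and an empty Disallow: adds no paths.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of makenh): given the text of a robots.txt
# file and a robot name, return the Disallow paths that apply to that robot.
sub disallowed_paths {
    my ($text, $robot) = @_;
    my (@specific, @default);     # paths from our own record / from the "*" record
    my ($saw_specific, $target);  # $target: the list the current record feeds, if any
    my $in_agents = 0;            # true while reading consecutive User-agent lines

    for my $line (split /\r?\n/, $text) {
        $line =~ s/\s*#.*$//;     # remove comments
        if ($line =~ /^User-agent:\s*(\S.*?)\s*$/i) {
            my $agent = $1;
            $target = undef unless $in_agents;  # first User-agent line of a new record
            $in_agents = 1;
            if ($agent eq '*') {
                $target = \@default unless $target;
            } elsif (index(lc $agent, lc $robot) >= 0) {
                # Assumption: case-insensitive substring match on the name.
                $target = \@specific;
                $saw_specific = 1;
            }
        } elsif ($line =~ /^Disallow:\s*(\S*)/i) {
            $in_agents = 0;
            # An empty Disallow: contributes no path, so nothing is excluded.
            push @$target, $1 if $target && length $1;
        }
    }
    # A record naming the robot overrides the "*" record entirely.
    return $saw_specific ? @specific : @default;
}

my $robots_txt = "User-agent: GPOHTTPGET\nDisallow:\nUser-agent: *\nDisallow: /\n";
for my $name ('GPOHTTPGET', 'OtherBot') {   # 'OtherBot' is a made-up second robot
    my @paths = disallowed_paths($robots_txt, $name);
    print "$name: ", (@paths ? "@paths" : "(nothing disallowed)"), "\n";
}
```

With the robots.txt shown above, this returns an empty list for GPOHTTPGET and "/" for any other robot. The quoted makenh loop differs in two ways: its Disallow pattern requires at least one non-space character, so the empty Disallow: ends the record, and the outer loop then goes on to match the "*" record as well.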
I'm not too familiar with Perl, and I don't want to try to fix the code if
it's not broken. But it doesn't seem to be working correctly.
Russell