testfix for Bug 112, "./././" problem
Hi folks,
There is a nasty bit of recursion involving links of the form
<A HREF="./myself.html">. Maqsood and I came up with a fix for it: basically,
we insert the line
$url =~ s/(^|\/)(\.(\/|$))+/$1/g;
before each place where an entry is added to %URL2FILE in makenh. In the
official release we'll try to do it more cleanly, in a single place in the
code, but in any case, if you're having the problem, please give it a try:
================================================================
Index: makenh
===================================================================
RCS file: /disk2/cvs/webglimpse/makenh,v
retrieving revision 1.18
retrieving revision 1.19
diff -c -r1.18 -r1.19
*** makenh 1999/11/18 23:41:33 1.18
--- makenh 1999/11/24 15:50:52 1.19
***************
*** 308,313 ****
--- 308,316 ----
# $file = &siteconf::LocalUrl2File($url);
# Now we check first if the url is local or remote. --GB 7/27/98
+ #Clean out any recursion in the path
+ $url =~ s/(^|\/)(\.(\/|$))+/$1/g;
+
if (CheckServer($url) == $URL_LOCAL) {
# just get the local file name
$file = &siteconf::LocalUrl2File($url);
***************
*** 998,1003 ****
--- 1001,1009 ----
# store $url and all redirects to map
foreach $url(@aliases){
+ #Clean out any recursion in the path
+ $url =~ s/(^|\/)(\.(\/|$))+/$1/g;
+
$URL2FILE{$url} = $file;
}
***************
*** 1074,1079 ****
--- 1080,1088 ----
$link = $guess;
$link =~ s/^\.\//$url/;
}
+ #Clean out any recursion in the path
+ $url =~ s/(^|\/)(\.(\/|$))+/$1/g;
+
$noindex = 1; # Default: index it.
***************
*** 1293,1298 ****
--- 1302,1310 ----
my ($url, $file, $numhops) = @_;
my (@thelist);
+ #Clean out any recursion in the path
+ $url =~ s/(^|\/)(\.(\/|$))+/$1/g;
+
push(@thelist, $url);
### MDSMITH -- added check for local_limit
***************
*** 1335,1340 ****
--- 1347,1353 ----
my($filename,$link_site, $url_site); # linksasfiles was unused --GB 7/5/98
foreach $url (@urllist) {
+
$file = $URL2FILE{$url};
# print "Looking at url: $url, file: $file\n";
***************
*** 1363,1368 ****
--- 1376,1384 ----
if(($link eq "1") || ($link eq " ")) {
next;
}
+
+ #Clean out any recursion in the path
+ $link =~ s/(^|\/)(\.(\/|$))+/$1/g;
# first, check if it's excluded
$noindex=1; # by default, it's accepted
(The changes to siteconf.pl are probably not necessary, but I can
think of some odd cases where a file alias reintroduces the "./"
business.)
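For anyone who wants to see what the substitution does before patching, here is a minimal standalone sketch. The helper name strip_dot_segments and the example.com URLs are made up for illustration and are not part of makenh; only the regex itself comes from the patch above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of makenh: it applies the same
# substitution as the patch and returns the cleaned URL.
sub strip_dot_segments {
    my ($url) = @_;
    # Remove every run of "./" segments (or a trailing ".") that starts
    # at the beginning of the string or right after a "/".
    # "../" segments and dots inside file names are left alone.
    $url =~ s/(^|\/)(\.(\/|$))+/$1/g;
    return $url;
}

# Illustrative inputs only.
for my $u ('./myself.html',
           'http://example.com/dir/./././myself.html',
           '../up.html') {
    printf "%-42s -> %s\n", $u, strip_dot_segments($u);
}
```

With this in place, "./myself.html" and "dir/./././myself.html" both collapse to a single form, so repeated self-links map to the same %URL2FILE key instead of producing ever-longer "./././" variants. Note that the regex does not touch "../" at all, so parent-directory links still have to be resolved elsewhere.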
------------------------------------------------------------
Golda Velez gvelez@tucson.com 520-620-6878
Internet Workshop http://tucson.com
Webglimpse Search Software http://webglimpse.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Help organize the world - index your own corner of the web