
Re: Search results titles



> The extraction of the HTML title depends on the suffix and is
> hardcoded into the glimpseindex executable. One possible solution for
> you would be to add .pl as an HTML-like suffix. To do that you would
> need to modify glimpse.h in the glimpse source tree and change
> these lines from
> 
> #define EXTRACT_INFO_SUFFIX {".htm", ".html", ".shtm", ".shtml"}
> #define NUM_EXTRACT_INFO_SUFFIX	4
>
> to
>
> #define EXTRACT_INFO_SUFFIX {".htm", ".html", ".shtm", ".shtml", ".pl" }
> #define NUM_EXTRACT_INFO_SUFFIX	5


That should work if the URL ends in .pl, but for Andy's example

	http://server.domain/cgi-bin/program.pl?foo=bar&bar=foo

I'm not sure it will, because the name then ends in the query string rather than in .pl.  He might have to hack the extension-test section in filetype.c.  The simplest approach would be to just comment out the test, like so:

diff -r1.6.2.1 filetype.c
149,151c149,151
<               for (i=0; i<NUM_EXTRACT_INFO_SUFFIX; i++) {
<                       if (!strcasecmp(&tempname[name_len - strlen(extract_info_suffix[i])], extract_info_suffix[i])) break;
<               }
---
> /*            for (i=0; i<NUM_EXTRACT_INFO_SUFFIX; i++) { */
> /*                    if (!strcasecmp(&tempname[name_len - strlen(extract_info_suffix[i])], extract_info_suffix[i])) break; */
> /*            } */
153c153
<               if (i < NUM_EXTRACT_INFO_SUFFIX) {
---
> /*            if (i < NUM_EXTRACT_INFO_SUFFIX) { */
155c155
<               }
---
> /*            } */

This would just force glimpseindex to treat all files as HTML regardless of extension.
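To illustrate why the query string is the problem, here is a small standalone sketch of the same kind of tail comparison.  It is not the glimpse source: the function names are made up for the example, and the second function is only a hypothetical, narrower alternative (ignore everything from the first '?' onward) that I have not tried against glimpseindex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Suffix list with ".pl" added, as in the quoted glimpse.h change. */
static const char *const extract_info_suffix[] = {
    ".htm", ".html", ".shtm", ".shtml", ".pl"
};
#define NUM_EXTRACT_INFO_SUFFIX \
    (sizeof extract_info_suffix / sizeof extract_info_suffix[0])

/* Hypothetical name.  Returns 1 if the tail of `name` matches one of the
 * suffixes (case-insensitively), 0 otherwise. */
static int has_extract_suffix(const char *name)
{
    size_t name_len, suf_len, i;

    assert(name != NULL);
    name_len = strlen(name);
    for (i = 0; i < NUM_EXTRACT_INFO_SUFFIX; i++) {
        suf_len = strlen(extract_info_suffix[i]);
        /* Guard against names shorter than the suffix. */
        if (name_len >= suf_len &&
            strcasecmp(name + name_len - suf_len, extract_info_suffix[i]) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical variant: run the same test on the part before the first '?'. */
static int has_extract_suffix_ignoring_query(const char *name)
{
    char buf[1024];
    size_t len;

    assert(name != NULL);
    len = strcspn(name, "?");      /* length up to '?' or end of string */
    if (len >= sizeof buf)
        return 0;                  /* too long for this sketch; treat as no match */
    memcpy(buf, name, len);
    buf[len] = '\0';
    return has_extract_suffix(buf);
}
```

With these, "program.pl" matches, "program.pl?foo=bar&bar=foo" does not (its tail is "=foo", not ".pl"), and the query-stripping variant matches both.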

If we do look at a hack to webglimpse instead, I think the easiest would be to manipulate the temporary filenames.  Since a script has to be retrieved anyway, makenh could just store it in a file named N.html instead of N.program.pl?foo=bar&bar=foo as it currently does.
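The naming rule I have in mind is nothing more than the following.  This is a sketch in C only to pin down the idea; makenh itself is not written this way, and temp_html_name is a made-up helper, not anything in webglimpse.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build the temporary filename "N.html" for the N-th
 * retrieved URL, ignoring the remote program name and query string.
 * Returns 0 on success, -1 if the buffer is too small. */
static int temp_html_name(unsigned long n, char *out, size_t out_size)
{
    int written;

    assert(out != NULL);
    written = snprintf(out, out_size, "%lu.html", n);
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

Since the stored name always ends in .html, the unmodified suffix test in glimpseindex would accept it and no change to glimpse would be needed.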

--Golda