Re: replacing the monster regex
>>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<
On 1/29/02, 4:13:42 PM, Golda Velez <(use contact form)> wrote regarding
Re: replacing the monster regex:
> At 04:18 PM 1/29/02 GMT, Derek Pomery wrote:
> >However, when you have a path like:
> >http://thiserver.thislongveryverylongdomainname.net/project/subproject/subsubproject/somefurtherdivision/andanother/oneortwo/forfurtherorganization/A ridiculously long file name that describes exactly what the use case is.html
> >
> >I was finding it was unsurprisingly breaking the splits || regex.
> >Hadn't gone back to see what the actual value for the substr was, since
> >everything worked fine without it. :)
> >
> Must have been a monster name indeed - the default limit to the substr
> was 10000 chars!
> But, this is still a good point, and if it runs fast enough without the
> substr we'll just drop it.
Ah. Took the time to look around, and found this line:
$QS_maxchars = '500'; # Maximum number of characters to print (in case file has no line breaks)
Chalk it up to my foolishness probably, long ago, and not your substr() :)
That was probably due to someone complaining about binary files like
MSWord docs putting gibberish in the output, so I trimmed 10,000 to 500
without looking at the side effects.
Possibly doing the substr() *AFTER* having extracted the file name part
(that part of the line divided by FILE_END_MARK as opposed to ":") might
make more sense.
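A rough sketch of that ordering in Perl, not checked against the real CGI code. The actual value of FILE_END_MARK isn't shown in this thread, so the tab below is only a placeholder, and split_then_trim is a made-up name for illustration:

```perl
use strict;
use warnings;

# Assumption: placeholder separator; the real FILE_END_MARK is defined elsewhere.
my $FILE_END_MARK = "\t";
my $QS_maxchars   = 500;   # same limit as the config line above

# Split the result line into (file name, matched text) first,
# then truncate only the text, so a long path is never cut short.
sub split_then_trim {
    my ($line) = @_;
    my ($file, $text) = split /\Q$FILE_END_MARK\E/, $line, 2;
    $file = '' unless defined $file;
    $text = '' unless defined $text;
    $text = substr($text, 0, $QS_maxchars) if length($text) > $QS_maxchars;
    return ($file, $text);
}

# A path well over 500 characters, followed by 2000 characters of text.
my $long_path = 'http://example.net/' . ('dir/' x 200) . 'file.html';
my ($file, $text) = split_then_trim($long_path . $FILE_END_MARK . ('x' x 2000));
print length($file), " ", length($text), "\n";
```

The path comes back whole however long it is, and only the text after the marker gets cut at $QS_maxchars.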
Filtering the file contents through the same filters wgindex uses might
be nice too, if not too slow.
> The code I sent you expects to be dropped into a new version of makenh
> that does the %20 to ' ' replacements and a bunch of other stuff as well.
> I'll try to get it off to you tonight so you don't have to spend time
> integrating the fragment I sent into the old version.
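For reference, the %20 replacement on its own is a one-liner. This is only an illustration of that substitution, not the code from the new makenh, and unescape_spaces is a made-up name:

```perl
use strict;
use warnings;

# Hypothetical helper: turn URL-escaped spaces back into literal spaces.
# Only %20 is handled here; other escapes are left untouched.
sub unescape_spaces {
    my ($path) = @_;
    $path =~ s/%20/ /g;
    return $path;
}

my $decoded = unescape_spaces('A%20ridiculously%20long%20file%20name.html');
print "$decoded\n";
```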
Thanks. I only need the CGIs and their libraries I think, but can't
hurt to update everything. (granted, I'll have to be careful I don't
blow away all the minor tweaking I've done :) )