Re: ignoring commas, other punctuation
Hi Christian - agreed on both counts. I was just hoping to get away without
modifying the glimpse code too deeply. But yes, we actually need other
ways to choose word delimiters anyway, to support languages like Thai
(there, we need to look up the words in a dictionary to figure out where
they end). I'll see if I can figure out where it needs to be changed. Any
hints are more than welcome! ;-)
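
To make the dictionary idea concrete, here is a rough sketch of greedy longest-match segmentation. It is written in Python for brevity and is not glimpse code; the function name `segment`, the toy word list, and the rule that an unknown character becomes its own one-character word are all assumptions made for illustration. A real Thai segmenter would need a proper dictionary and probably a smarter strategy than pure greedy matching.

```python
def segment(text, dictionary, max_word_len=None):
    """Split unspaced text into words by greedy longest dictionary match.

    Characters that start no dictionary word are emitted as
    single-character tokens (an assumption of this sketch).
    """
    if max_word_len is None:
        max_word_len = max((len(w) for w in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                i += length
                break
    return words


# Hypothetical toy dictionary; English stands in for an unspaced script.
TOY_DICTIONARY = {"index", "your", "own", "corner"}

print(segment("indexyourowncorner", TOY_DICTIONARY))
# ['index', 'your', 'own', 'corner']
```

Greedy matching can pick the wrong boundary when one dictionary word is a prefix of another, so this only shows where a dictionary lookup would plug in, not how it should finally behave.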
--G
>
>In my opinion the Right Thing(TM) would be to extend glimpse such that you
>can specify arbitrary sets of characters as word delimiters. It is also
>likely to yield much faster searches than regular expressions.
>
>BTW, glimpse does not follow the POSIX standard for specifying regular
>expressions. According to POSIX, 'Christian*' means to look for 'Christia'
>followed by an arbitrary number of n's (including zero). In glimpse you
>need to specify this search as 'Christia(n)*'. This can be confusing to
>people who actually know regexps and would like to use them.
>
>- Christian
>
>
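
For reference, the two points quoted above can be sketched in a few lines of Python. This is only an illustration, not glimpse code: `split_words` and the `DELIMS` set are hypothetical names, and Python's `re` module is not a POSIX implementation, although it agrees with POSIX on how `*` binds to the single preceding character.

```python
import re


def split_words(text, delimiters):
    """Split text on any character in `delimiters`, dropping empty tokens."""
    words, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


# Hypothetical delimiter set; a real one would come from configuration.
DELIMS = set(" ,.;:!?")

print(split_words("commas, other punctuation", DELIMS))
# ['commas', 'other', 'punctuation']

# In standard regex syntax the star applies only to the preceding 'n'.
star = re.compile(r"Christian*")
print(bool(star.fullmatch("Christia")))            # True: zero n's
print(bool(star.fullmatch("Christiannn")))         # True: three n's
print(bool(star.fullmatch("ChristianChristian")))  # False: the word is not repeated
```

A plain membership test per character is also the reason a delimiter set should be cheaper than a regular expression for this job: there is no pattern to compile or backtrack over.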
------------------------------------------------------------
Golda Velez gvelez@tucson.com 520-620-6878
Internet Workshop http://tucson.com
Webglimpse Search Software http://webglimpse.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Help organize the world - index your own corner of the web