Re: ignoring stopwords?
Hi Dennis
Unfortunately, there is no actual 'stop list' file - glimpseindex decides
on the fly which words not to index, but does not save a list of them
anywhere.
It's a good idea, but I think you would have to implement it the way you
already thought of (calls to glimpse to determine frequency), or do some
hacking in the glimpseindex code. If you try the latter let me know,
perhaps I can help some.
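
A rough sketch of the first approach (filtering the query yourself before it
reaches glimpse), in case it helps. Everything named here is hypothetical:
count_files stands in for whatever call to glimpse you use to get a word's
frequency (for example the -N test you mention), and total_files and the 0.5
cutoff are assumptions, not values glimpseindex itself uses.

```python
import re


def make_frequency_filter(count_files, total_files, max_fraction=0.5):
    """Build an is_stopword(word) test from a frequency lookup.

    count_files is a hypothetical callable returning how many files
    contain the word; in practice it would wrap a call to glimpse.
    The max_fraction cutoff is an assumption for illustration only.
    """
    def is_stopword(word):
        return count_files(word) > max_fraction * total_files
    return is_stopword


def strip_stopwords(query, is_stopword):
    """Drop stop words from a query that uses ';' (AND) and ',' (OR).

    Each surviving term is joined to the previous surviving term with
    the operator that immediately preceded it in the original query.
    """
    # The capturing group makes re.split keep the operators in the list.
    parts = re.split(r"([;,])", query)
    terms = parts[0::2]
    operators = parts[1::2]

    kept = []
    for i, term in enumerate(terms):
        word = term.strip()
        if not word or is_stopword(word):
            continue
        if kept:
            kept.append(operators[i - 1])
        kept.append(word)
    return "".join(kept)


if __name__ == "__main__":
    # Made-up counts standing in for real lookups against the index.
    counts = {"play": 3, "in": 90, "the": 100, "garden": 2}
    is_stop = make_frequency_filter(lambda w: counts.get(w, 0), total_files=100)
    print(strip_stopwords("play;in;the;garden", is_stop))
    print(strip_stopwords("play,in,the,garden", is_stop))
```

With mixed operators this keeps whichever operator sat directly before the
surviving word, which is one arbitrary way around the theoretical problem you
mention. If every word is a stop word the result is an empty string, so check
for that before running the query.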
--Golda
At 04:52 PM 11/5/02 -0600, Daniel Mahler wrote:
>
>Hello again,
>
>Is there a way to make glimpse drop stop words from queries?
>i.e. to make "play;in;the;garden" just act like "play;garden"
>and also "play,in,the,garden" as "play,garden".
>Or put another way, make stop words act like the identity
>of both logical operators
>[I do know there are theoretical problems with that]
>
>I am constructing queries from input text
>and I would like stop words treated as noise.
>However I want stop words to coincide with
>the statistically determined stopwords
>that glimpse generates rather than trying to construct a list.
>I could try using -N to test the frequency of each word first,
>but is there something more elegant?
>
>thanks
>
>D
>
>
>
------------------------------------------------------------
Golda Velez (use contact form) 626-792-9277
Internet Workshop http://iwhome.com
Webglimpse Search Software http://webglimpse.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Help organize the world - index your own corner of the web