Re: limits on glimpse's performance
No, but an index for a 5G corpus
builds in a few hours on a 2GHz box
[indexing numbers and all words, including stop words].
You will need to build the indexes from the command line
using glimpseindex or hack your webglimpse scripts
to change the options passed to glimpseindex.
Key things I learned are:
* build a small (-o) index; tiny and medium were broken and/or flaky.
* use as much memory for the build as you can (-M 512);
  if you leave the default (-M 2) you will wait until the heat death of the universe.
* also use -B
* you will probably want turbo files for -i -w queries (-T)
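
Put together, the options above might be combined into a single glimpseindex call along these lines. This is only a sketch: the corpus path is a made-up placeholder, -M is assumed to be in megabytes, and the script just prints the command so you can check it before running it by hand.

```shell
#!/bin/sh
# Hypothetical corpus location; replace with your own directory.
CORPUS_DIR="/data/corpus"

# Options taken from the list above:
#   -o      build a small index
#   -M 512  memory for the build (assumed to be MB)
#   -B      as recommended above
#   -T      build turbo files for -i -w queries
GLIMPSE_OPTS="-o -M 512 -B -T"

# Assemble the command and print it; nothing is executed here.
CMD="glimpseindex $GLIMPSE_OPTS $CORPUS_DIR"
echo "$CMD"
```

If the printed line looks right, paste it into a shell on the indexing box, or put the same options into the webglimpse script that calls glimpseindex.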
Dragomir R. Radev writes:
> Has anyone successfully used glimpse to index a 100G corpus?