Indexing MS Word and Excel Documents
last updated 3/9/06
To index a Microsoft Word document with Glimpse, you need
a command-line Word-to-ASCII converter. Thanks to Vitus Wagner,
there is a program called catdoc that does the job nicely.
NEW: recent versions of catdoc also include the program xls2csv,
which allows on-the-fly conversion of Excel documents in the same way as MS Word documents.
Note that this is very new and may have some bugs: I got the error message
"No BOF record found" when trying to convert an Excel document containing images.
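To check whether an installed catdoc already ships xls2csv, a small shell sketch like the one below can report which converters are missing from the PATH. The function name missing_tools is made up for this example; only catdoc and xls2csv come from the text above.

```shell
#!/bin/sh
# Print the name of each given command that cannot be found on the PATH.
# missing_tools is a hypothetical helper, not part of catdoc or Glimpse.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: prints nothing when both converters are installed.
# missing_tools catdoc xls2csv
```

If the example call prints xls2csv, the installed catdoc is probably too old to include the Excel converter.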
- Download and install catdoc version 0.90 or later from
http://www.45.free.net/~vitus/ice/catdoc/
- Add the following (or similar) to the file .glimpse_filters
For MS-Word:
*.doc /usr/local/bin/catdoc <
*.DOC /usr/local/bin/catdoc <
For Excel:
*.xls /usr/local/bin/xls2csv <
*.XLS /usr/local/bin/xls2csv <
- Edit the wgreindex file in each archive that needs to access non-ASCII files.
Change both glimpseindex command lines to add the -z option, like so:
/bin/cat /home/WWW/proj/test/.wg_toindex | /usr/local/bin/glimpseindex -n -H /home/WWW/proj/test -o -t -h -X -U -f -C -F -z > /dev/null
/bin/cat /home/WWW/proj/test/.wg_toindex | /usr/local/bin/glimpseindex -n -H /home/WWW/proj/test -o -t -h -X -U -f -C -F -z
That should be it - we've tested it here, and it works for us...
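For sites with several archives, the two configuration steps above could be scripted. The following is only a rough sketch: the helper names add_filter and add_z_option are invented here, the example paths are assumptions, and the sketch places -z directly after the glimpseindex command name rather than at the end of the line, on the assumption that option order does not matter.

```shell
#!/bin/sh
# Hypothetical helpers; names and argument order are made up for illustration.

# Append "PATTERN COMMAND <" to a filters file unless that exact line exists.
add_filter() {
    filters_file=$1
    line="$2 $3 <"
    touch "$filters_file"
    grep -qxF -- "$line" "$filters_file" || printf '%s\n' "$line" >> "$filters_file"
}

# Print a copy of a wgreindex file with -z added to every glimpseindex
# command line that does not already carry it. The original is not modified.
add_z_option() {
    awk '
        /glimpseindex/ && !/ -z( |$)/ { sub(/glimpseindex/, "glimpseindex -z") }
        { print }
    ' "$1"
}

# Example (paths are assumptions; adjust to your archive directory):
# add_filter /home/WWW/proj/test/.glimpse_filters '*.doc' /usr/local/bin/catdoc
# add_z_option /home/WWW/proj/test/wgreindex > /tmp/wgreindex.new
```

Because add_z_option only writes to standard output, the result can be reviewed by hand before it replaces the real wgreindex file.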
Questions: webglimpse-support@iwhome.com