PDF Indexing
Hello,
I have installed glimpse-4.12.6 and webglimpse-1.7.6edu, and
they work well with HTML files. I followed the directions for
indexing PDF files (I tried both Ghostscript and xpdf), but the
PDF files do not seem to be indexed correctly. I made the
documented changes to wgreindex, .glimpse_filters, and
.wgfilter-index, but a search for text that appears in the body
of a PDF document returns nothing. The indexing run reports no
errors and says that it gets the PDF files. A search on the title
of a PDF document does return it, but the matched text is raw PDF
data rather than converted text.
Here is the result of searching for "September", which is in the title
of the document:
http://axp105.ams.org/secretary/council-minutes0992.pdf, Apr 20 2000
99 0 obj << /CreationDate (D:19990310130217) /Producer (Acrobat Distiller
Command 3.01 for Solaris 2.3 and later\(SPARC\)) /Creator (dvips\(k\) 5.78
Copyright 1998 Radical Eye Software \(www.radicaleye.com\ \)) /Title (September 8
1992 Council Minutes) /ModDate (D:19990310131018) >> endobj xref 99 1 0000104947
00000 n trailer << /Size 100/Info 99 0 R /Root 98 0 R /Prev 102788
/ID[<932ef60fec635f1addbb873a7042389d><4734f53558fc99dc9e96a3a08303e6d5>] >>
startxref 105241 %%EOF
Does it matter that I am doing this on a Digital Unix machine? When I run the
commands to convert to text (usexpdf.pl or processpdf.pl), the text comes out
fine.
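
One quick way to tell whether a returned excerpt came from converted text or
from the unfiltered PDF file is to look for raw PDF syntax in it. The following
is only a rough sketch, assuming Python is available; the function name, the
marker list, and the threshold are hypothetical and are not part of Glimpse or
WebGlimpse.

```python
# Hypothetical helper, not part of Glimpse or WebGlimpse.
# These tokens appear in raw PDF structure but rarely in extracted text.
RAW_PDF_MARKERS = ("endobj", "startxref", "%%EOF", "/Producer", "/CreationDate")


def looks_like_raw_pdf(excerpt, threshold=2):
    """Return True if the excerpt contains at least `threshold` distinct markers."""
    hits = sum(1 for marker in RAW_PDF_MARKERS if marker in excerpt)
    return hits >= threshold


if __name__ == "__main__":
    # Shortened form of the search hit quoted above.
    hit = ("/Title (September 8 1992 Council Minutes) "
           "/ModDate (D:19990310131018) >> endobj xref 99 1 "
           "startxref 105241 %%EOF")
    print(looks_like_raw_pdf(hit))
    # A line of ordinary prose, as a converter would be expected to produce.
    print(looks_like_raw_pdf("Minutes of the Council, September 8, 1992"))
```

If the check is true for a search hit, the index was probably built from the
PDF file itself and not from the converter's output, even though the
conversion scripts work when run by hand.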
Any help would be appreciated.
--
Bob Morse
System Administrator
American Mathematical Society
401 455 4162