Getting Started




This documentation is organized into sections, each containing several related subsections. After reading this section, you should be comfortable reading the others in any order. This organization is intended to let you create a working Webglimpse search application quickly and then customize it at your own pace.

What is Webglimpse?

Webglimpse is a web indexing system that lets you add search capabilities to your site. The software comprises several parts, including a spider and manager, written in Perl, and Glimpse, the core indexing and search component, written in C. Glimpse, which stands for GLobal IMPlicit SEarch, is a popular UNIX indexing and query system that lets you search through a large set of files very quickly. Webglimpse combines the web administration interface, the remote link spider, and the Glimpse file indexing and query system into a single searching and indexing solution.

What Can Webglimpse Do?

Overview of Webglimpse Features

Ease of Use:

Web Administration Interface

Speed and Power:

Fast Searching

Large Data Sets:

Glimpse has been tested on data sets of up to 9 gigabytes. There is no built-in size limitation on the index.

Result Caching/Large # of Hits:

Webglimpse maintains a cache of recent searches


Configurable rules for maximum flexibility

Customized Results Output:

Using output templates

Customized Ranking of Hits:

Four built-in ranking schemes with META tag support

Index Local and Remote Pages:

Flexible rules for gathering pages from remote sites

Boolean Expressions, Wildcards, Misspellings & more:

Powerful agrep engine allowing partial or whole-word matches, complex boolean
combinations and regular expressions

HTML, PDF, Word, other formats:

Any file type that can be converted to text can be indexed

Dynamically Generated Pages:

Allows both dynamically created and static pages in a single index

"Neighborhood" Searching:

For large sites with many different areas, it is possible to set up localized
searching of just the links from each page

Query Log:

Keywords entered are logged so you can see how users are actually
interacting with your site

Index Any Single-Byte Language:

Index any language that is single-byte encoded

Preset Results Output and Search Forms:

Search forms and result templates available in the following languages:
Arabic, Bulgarian, Estonian, English, German, Spanish, French, Hebrew,
Italian, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian and Finnish.

This list provides a high-level overview of Webglimpse's features. A more detailed feature list, including a comparison of the com and edu versions, can be found in the appendix.
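
To make the "misspellings" feature in the list above more concrete, here is a minimal Python sketch of whole-word matching that tolerates a bounded number of errors. It is only an illustration of the idea: it uses a plain dynamic-programming edit distance, not the algorithm that agrep or Glimpse actually implement, and the function names edit_distance and approx_word_matches are hypothetical, not part of Webglimpse.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def approx_word_matches(query, text, max_errors=1):
    """Return the words of text within max_errors edits of query.

    Illustration only: comparison is case-insensitive and words are
    split on whitespace, which is an assumption of this sketch.
    """
    q = query.lower()
    return [w for w in text.split() if edit_distance(q, w.lower()) <= max_errors]


if __name__ == "__main__":
    sample = "Webglimpse adds serch capabilities to your site"
    print(approx_word_matches("search", sample))
```

With max_errors set to 1, the query "search" still finds the misspelled word "serch" in the sample text, which is the behavior a user sees when a search engine tolerates typing errors.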

Continue on to the Simple Tutorial to learn more.