Characteristics of question format web queries: an exploratory study
Analysis of large data logs: an application of Poisson sampling on Excite web queries
Mining a Web Citation Database for author co-citation analysis
Integrated multi-strategic Web document pre-processing for sentence and word boundary detection
The use of bigrams to enhance text categorization
Efficient stemmer generation
The effectiveness of query-specific hierarchic clustering in information retrieval
A feature mining based approach for the classification of text documents into disjoint classes