Sponsored by the Advanced Research and Development Activity in Information Technology (ARDA) under its Statistical Language Modeling for Information Retrieval Research Program, the Lemur Project has recently announced the availability of version 1.0 of the Lemur Toolkit for Language Modeling and Information Retrieval. The Lemur Toolkit is designed to support research in areas such as ad hoc and distributed retrieval, cross-language IR, summarization, filtering, and classification. The toolkit supports the indexing of large-scale text databases, the construction of simple language models for documents, queries, and more. Written in C and C++, it is designed as a research system for Unix operating systems, although it can also run under Windows. As part of the Lemur Project, the Lemur Toolkit is a collaboration between the Computer Science Department at the University of Massachusetts and the School of Computer Science at Carnegie Mellon University.