TPEMatcher: A tool for searching in parsed text corpora

Authors:

Highlights:

Abstract

Recently, with syntactically annotated text corpora widely available online, automated tools for searching such corpora have attracted considerable attention. Conventional corpus search tools generally match a tree pattern query against the relevant parts of a corpus using a decomposition-matching-merging method based on relational predicates. As a result, queries are often hard to formulate and limited in expressivity because the underlying query formalisms are poorly understood, and searching may incur substantial computational overhead because tree patterns are matched repeatedly. To overcome these difficulties, we present TPEMatcher, a tool for searching parsed text corpora. TPEMatcher offers efficient query formulation and searching, together with good query expressivity based on a concise syntax and semantics for tree pattern queries. We also demonstrate that TPEMatcher can be used effectively for text mining in practice, through an interface that presents in-depth details of the search results.
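
To make the notion of matching a tree pattern query against a parsed sentence concrete, the following is a minimal Python sketch. It is not TPEMatcher's query syntax or algorithm, which the abstract does not specify; the names `Node`, `Pattern`, `matches_at`, and `find_matches`, the `"*"` wildcard, and the example sentence are all assumptions made for illustration. The sketch checks a whole pattern against each candidate node in one recursive pass, with child and descendant relations only.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """A node of a parse tree (hypothetical structure for this sketch)."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Pattern:
    """A tree pattern node; "*" is an assumed wildcard matching any label."""
    label: str
    children: List["Pattern"] = field(default_factory=list)
    descendant: bool = False  # True: may match at any depth below the parent


def descendants(node: Node) -> Iterator[Node]:
    """Yield every node strictly below `node`, in pre-order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def matches_at(pattern: Pattern, node: Node) -> bool:
    """Return True if `pattern` matches the subtree rooted at `node`."""
    if pattern.label != "*" and pattern.label != node.label:
        return False
    for sub in pattern.children:
        # Child axis looks at direct children; descendant axis at any depth.
        candidates = descendants(node) if sub.descendant else node.children
        if not any(matches_at(sub, c) for c in candidates):
            return False
    return True


def find_matches(pattern: Pattern, root: Node) -> List[Node]:
    """Return all nodes of the tree at which the pattern matches."""
    return [n for n in [root, *descendants(root)] if matches_at(pattern, n)]


# (S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))
tree = Node("S", [
    Node("NP", [Node("DT"), Node("NN")]),
    Node("VP", [Node("VBD"), Node("NP", [Node("DT"), Node("NN")])]),
])

# Query: a VP that directly dominates an NP containing an NN anywhere below.
query = Pattern("VP", [Pattern("NP", [Pattern("NN", descendant=True)])])
hits = find_matches(query, tree)
print([n.label for n in hits])
```

In this sketch the query is written as a single nested pattern rather than as a set of separate relational predicates whose partial results are merged afterwards, which is the kind of contrast the abstract draws with decomposition-matching-merging tools.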

Keywords: Corpus search tool, Tree pattern querying, Tree pattern matching, Parsed text corpora, Text mining

Review history: Received 7 October 2010, Revised 12 April 2011, Accepted 12 April 2011, Available online 19 April 2011.

Official paper URL: https://doi.org/10.1016/j.knosys.2011.04.009