public interface TermProcessor extends Serializable, FlyweightPrototype<TermProcessor>
Index construction sometimes requires modifying the given terms: downcasing, stemming, and so on. The same transformation must be applied to terms in a query. This interface provides a uniform way to perform arbitrary term transformations.
Index construction also requires term filtering:
processTerm(MutableString) may
return false, indicating that the term should not
be processed at all (e.g., because it is a stopword).
Additionally, the method processPrefix(MutableString) may
process a prefix analogously (used for prefix queries).
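The contract described above can be sketched as follows. To stay self-contained, the sketch does not use the real MG4J types: `SketchTermProcessor` is a hypothetical stand-in for this interface, `StringBuilder` stands in for `MutableString`, and the stopword list is invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for TermProcessor; StringBuilder replaces MutableString.
interface SketchTermProcessor {
    boolean processTerm(StringBuilder term);
    boolean processPrefix(StringBuilder prefix);
}

// Downcases terms in place and rejects a small, made-up set of stopwords.
class DowncaseStopwordProcessor implements SketchTermProcessor {
    private static final Set<String> STOPWORDS =
        new HashSet<>(Arrays.asList("the", "a", "of"));

    // Rewrites the buffer in place with its lowercase form.
    private static void downcase(StringBuilder s) {
        String lower = s.toString().toLowerCase(Locale.ROOT);
        s.setLength(0);
        s.append(lower);
    }

    @Override
    public boolean processTerm(StringBuilder term) {
        if (term == null) return false; // null terms are never indexed
        downcase(term);
        // false tells the caller to skip this term entirely
        return !STOPWORDS.contains(term.toString());
    }

    @Override
    public boolean processPrefix(StringBuilder prefix) {
        if (prefix == null) return false;
        downcase(prefix);
        // A prefix of a stopword may still start an indexable word, so accept it.
        return true;
    }
}
```

In this sketch the caller passes the same buffer it intends to index or query with, then checks the returned flag before using the transformed contents.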
Implementations are encouraged to expose a singleton, when
possible, by means of the static factory method getInstance().
Warning: implementations of this interface are not required
to be thread-safe, but they provide flyweight copies.
The copy() method is strengthened so as to return an instance of this interface.
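The singleton and flyweight-copy conventions can be illustrated with a minimal sketch. All names here are hypothetical: `CopyableTermProcessor` stands in for this interface, `StringBuilder` stands in for `MutableString`, and the two implementations only show one plausible way to honour the conventions. A stateless processor can share one instance, while a processor with mutable state hands out a fresh copy for each thread.

```java
// Hypothetical stand-in for TermProcessor; StringBuilder replaces MutableString.
interface CopyableTermProcessor {
    boolean processTerm(StringBuilder term);
    CopyableTermProcessor copy();
}

// Stateless processor: a single shared instance suffices, so copy() returns it.
final class LowercaseProcessor implements CopyableTermProcessor {
    private static final LowercaseProcessor INSTANCE = new LowercaseProcessor();

    private LowercaseProcessor() {}

    public static LowercaseProcessor getInstance() {
        return INSTANCE;
    }

    @Override
    public boolean processTerm(StringBuilder term) {
        if (term == null) return false;
        for (int i = 0; i < term.length(); i++) {
            term.setCharAt(i, Character.toLowerCase(term.charAt(i)));
        }
        return true;
    }

    // Covariant return type: copy() yields the implementing class itself.
    @Override
    public LowercaseProcessor copy() {
        return this;
    }
}

// Stateful processor: it mutates a counter, so each thread needs its own copy.
final class CountingProcessor implements CopyableTermProcessor {
    private int seen; // mutable state, not safe to share across threads

    @Override
    public boolean processTerm(StringBuilder term) {
        if (term == null) return false;
        seen++;
        return true;
    }

    int seen() {
        return seen;
    }

    // Each copy starts with fresh state.
    @Override
    public CountingProcessor copy() {
        return new CountingProcessor();
    }
}
```

A multithreaded caller would, under these assumptions, call `copy()` once per worker thread and never share the resulting object between threads.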
This interface was originally suggested by Fabien Campagne.
| Modifier and Type | Method and Description |
|---|---|
| TermProcessor | copy() |
| boolean | processPrefix(MutableString prefix): Processes the given prefix, leaving the result in the same mutable string. |
| boolean | processTerm(MutableString term): Processes the given term, leaving the result in the same mutable string. |
boolean processTerm(MutableString term)
Parameters: term - a mutable string containing the term to be processed, or null.
Returns: true if the term is not null and should be indexed, false otherwise.
boolean processPrefix(MutableString prefix)
This method is not used during the indexing phase, but rather at query time. If the user wants to specify a prefix query, it is sometimes necessary to transform the prefix (e.g., by downcasing it, as DowncaseTermProcessor.processPrefix(MutableString) does).
It is of course unlikely that this method returns false, as it is usually not possible to foresee the prefixes of indexable words. If no natural transformation applies, this method should leave its argument unchanged.
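As an illustration of query-time prefix processing, the following sketch transforms a raw prefix before looking it up among indexed terms. It is not MG4J code: `PrefixQuerySketch` and its methods are hypothetical, `StringBuilder` stands in for `MutableString`, and a sorted `TreeSet` of already-processed terms stands in for the index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical helper, not part of MG4J; StringBuilder replaces MutableString.
final class PrefixQuerySketch {
    // Stand-in for a downcasing processPrefix: rewrites the prefix in place.
    static boolean processPrefix(StringBuilder prefix) {
        if (prefix == null) return false;
        String lower = prefix.toString().toLowerCase(Locale.ROOT);
        prefix.setLength(0);
        prefix.append(lower);
        return true;
    }

    // Returns the indexed terms that start with the processed form of rawPrefix.
    static List<String> matches(TreeSet<String> indexedTerms, String rawPrefix) {
        List<String> result = new ArrayList<>();
        if (rawPrefix == null) return result;
        StringBuilder prefix = new StringBuilder(rawPrefix);
        if (!processPrefix(prefix)) return result; // no indexed word can match
        String p = prefix.toString();
        // tailSet starts at the first term >= p; matching terms are contiguous.
        for (String term : indexedTerms.tailSet(p)) {
            if (!term.startsWith(p)) break;
            result.add(term);
        }
        return result;
    }
}
```

Because the indexed terms were downcased at indexing time, a query for the prefix `APP` only finds them if the prefix goes through the same transformation first.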
Parameters: prefix - a mutable string containing a prefix to be processed, or null.
Returns: true if the prefix is not null and there might be an indexed word starting with prefix, false otherwise.
TermProcessor copy()
Specified by: copy in interface FlyweightPrototype<TermProcessor>