Package smile.nlp.stemmer
Class LancasterStemmer
java.lang.Object
smile.nlp.stemmer.LancasterStemmer
The Paice/Husk Lancaster stemming algorithm, a conflation-based
 iterative stemmer. Although efficient and easy to implement, the
 stemmer is known to be very strong and aggressive. It uses a single
 table of rules, each of which may specify the removal or replacement
 of an ending. For details, see the reference below.
 
References
- Paice, Another stemmer, SIGIR Forum, 24(3), 56-61, 1990.
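
The description above (a single rule table, applied iteratively, where each rule removes or replaces an ending) can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the class name RuleTableStemmerSketch, the Rule type, the five toy rules, and the minimum-length guard are hypothetical and are not the Lancaster rule set, the Smile implementation, or its rule file format.

```java
public class RuleTableStemmerSketch {
    /** Hypothetical rule: if a word ends with `ending`, drop it and append `replacement`. */
    static final class Rule {
        final String ending;
        final String replacement;
        final boolean continueStemming; // true: try the table again after applying this rule

        Rule(String ending, String replacement, boolean continueStemming) {
            this.ending = ending;
            this.replacement = replacement;
            this.continueStemming = continueStemming;
        }
    }

    // A toy rule table for illustration only; NOT the real Lancaster rules.
    static final Rule[] RULES = {
        new Rule("ies", "y", true),
        new Rule("ing", "", true),
        new Rule("ation", "ate", true),
        new Rule("ate", "", false),
        new Rule("s", "", true),
    };

    static String stem(String word) {
        String current = word.toLowerCase();
        boolean keepGoing = true;
        while (keepGoing) {
            keepGoing = false;
            for (Rule rule : RULES) {
                if (!current.endsWith(rule.ending)) {
                    continue;
                }
                String candidate =
                    current.substring(0, current.length() - rule.ending.length()) + rule.replacement;
                // Crude acceptability guard (an assumption): never produce a stem under 3 letters.
                if (candidate.length() < 3) {
                    continue;
                }
                current = candidate;
                keepGoing = rule.continueStemming;
                break; // restart from the top of the table with the shortened word
            }
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(stem("relations")); // prints "rel"
        System.out.println(stem("ponies"));    // prints "pony"
    }
}
```

In this sketch "relations" passes through "relation" and "relate" before stopping at "rel", which gives a feel for why an iterative rule-table stemmer can be aggressive.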
 
Constructor Summary
Constructors
- LancasterStemmer(): Constructor with default rules.
- LancasterStemmer(boolean stripPrefix): Constructor with default rules.
- LancasterStemmer(InputStream customizedRules): Constructor with customized rules.
- LancasterStemmer(InputStream customizedRules, boolean stripPrefix): Constructor with customized rules.
Method Summary
 
Constructor Details
LancasterStemmer
public LancasterStemmer()
Constructor with default rules. By default, the stemmer does not strip prefixes from words.
LancasterStemmer
public LancasterStemmer(boolean stripPrefix)
Constructor with default rules.
Parameters:
 stripPrefix - true if the stemmer should strip prefixes such as kilo, micro, milli, intra, ultra, mega, nano, pico, pseudo.
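
To make the effect of stripPrefix concrete, here is a minimal sketch of prefix stripping over the prefixes listed above. The class name PrefixStripSketch, the helper stripPrefix(String), and the choice to leave a word unchanged when it consists only of the prefix are assumptions for illustration; the library's actual behavior may differ in details.

```java
public class PrefixStripSketch {
    // Prefixes named in the stripPrefix parameter description.
    static final String[] PREFIXES = {
        "kilo", "micro", "milli", "intra", "ultra", "mega", "nano", "pico", "pseudo"
    };

    /** Removes the first matching prefix, if any, as long as something remains afterwards. */
    static String stripPrefix(String word) {
        for (String prefix : PREFIXES) {
            if (word.startsWith(prefix) && word.length() > prefix.length()) {
                return word.substring(prefix.length());
            }
        }
        return word; // no listed prefix, or the word is only the prefix itself
    }

    public static void main(String[] args) {
        System.out.println(stripPrefix("kilometer"));  // prints "meter"
        System.out.println(stripPrefix("microscope")); // prints "scope"
        System.out.println(stripPrefix("kilo"));       // prints "kilo"
    }
}
```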
LancasterStemmer
public LancasterStemmer(InputStream customizedRules) throws IOException
Constructor with customized rules. By default, the stemmer does not strip prefixes from words.
Parameters:
 customizedRules - an input stream from which to read the customized rules.
Throws:
 IOException - if the rule file cannot be read.
LancasterStemmer
public LancasterStemmer(InputStream customizedRules, boolean stripPrefix) throws IOException
Constructor with customized rules.
Parameters:
 customizedRules - an input stream from which to read the customized rules.
 stripPrefix - true if the stemmer should strip prefixes such as kilo, micro, milli, intra, ultra, mega, nano, pico, pseudo.
Throws:
 IOException - if the rule file cannot be read.
 
Method Details