Package org.tartarus.snowball
Class SnowballProgram
java.lang.Object
org.tartarus.snowball.SnowballProgram
Direct Known Subclasses:
ArabicStemmer, ArmenianStemmer, BasqueStemmer, CatalanStemmer, DanishStemmer, DutchStemmer, EnglishStemmer, EstonianStemmer, FinnishStemmer, FrenchStemmer, German2Stemmer, GermanStemmer, HungarianStemmer, IrishStemmer, ItalianStemmer, KpStemmer, LithuanianStemmer, LovinsStemmer, NorwegianStemmer, PorterStemmer, PortugueseStemmer, RomanianStemmer, RussianStemmer, SpanishStemmer, SwedishStemmer, TurkishStemmer
This is rev 502 of the Snowball SVN trunk, now located at GitHub, but modified:
- Made abstract and introduced the abstract method stem() to avoid expensive reflection in the filter class.
- Refactored StringBuffer usage to StringBuilder.
- Uses char[] as the buffer instead of StringBuffer/StringBuilder.
- eq_s, eq_s_b, insert, and replace_s take a CharSequence, like eq_v and eq_v_b.
- Uses MethodHandles and fixes a method visibility bug.
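The first modification in the list can be illustrated with a minimal sketch. The classes SketchStemmerBase, NoOpStemmer, and DispatchDemo are hypothetical stand-ins invented for this example and are not part of org.tartarus.snowball. The sketch only contrasts a reflective lookup of a method named stem with a plain call through an abstract method; it makes no claim about how the filter class is actually implemented.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for an abstract base class that declares stem().
abstract class SketchStemmerBase {
    public abstract boolean stem();
}

// Hypothetical stemmer that does nothing and reports success.
class NoOpStemmer extends SketchStemmerBase {
    @Override
    public boolean stem() {
        return true;
    }
}

class DispatchDemo {
    // Reflective dispatch: the method is looked up by name and invoked
    // through java.lang.reflect, which adds lookup and boxing overhead.
    static boolean stemViaReflection(Object stemmer) {
        try {
            Method m = stemmer.getClass().getMethod("stem");
            return (Boolean) m.invoke(stemmer);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("stem() could not be invoked", e);
        }
    }

    // Direct dispatch: because stem() is declared on the base class,
    // the caller needs no reflection at all.
    static boolean stemDirectly(SketchStemmerBase stemmer) {
        return stemmer.stem();
    }

    public static void main(String[] args) {
        SketchStemmerBase stemmer = new NoOpStemmer();
        System.out.println(stemViaReflection(stemmer));
        System.out.println(stemDirectly(stemmer));
    }
}
```

Both calls return the same result; the difference is that the direct call is an ordinary virtual method call that the compiler can check.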
Method Summary
Modifier and Type          Method                                                   Description
protected StringBuilder    assign_to
protected void             copy_from(SnowballProgram other)
protected boolean          eq_s(int s_size, CharSequence s)
protected boolean          eq_s_b(int s_size, CharSequence s)
protected boolean          eq_v(CharSequence s)
protected boolean          eq_v_b
protected int              find_among(Among[] v, int v_size)
protected int              find_among_b(Among[] v, int v_size)
String                     getCurrent()                                             Get the current string.
char[]                     getCurrentBuffer()                                       Get the current buffer containing the stem.
int                        getCurrentBufferLength()                                 Get the valid length of the character array in getCurrentBuffer().
protected boolean          in_grouping(char[] s, int min, int max)
protected boolean          in_grouping_b(char[] s, int min, int max)
protected boolean          in_range(int min, int max)
protected boolean          in_range_b(int min, int max)
protected void             insert(int c_bra, int c_ket, CharSequence s)
protected boolean          out_grouping(char[] s, int min, int max)
protected boolean          out_grouping_b(char[] s, int min, int max)
protected boolean          out_range(int min, int max)
protected boolean          out_range_b(int min, int max)
protected int              replace_s(int c_bra, int c_ket, CharSequence s)
void                       setCurrent(char[] text, int length)                      Set the current string.
void                       setCurrent(String value)                                 Set the current string.
protected void             slice_check()
protected void             slice_del()
protected void             slice_from
protected StringBuilder    slice_to
abstract boolean           stem()
Field Details
current
private char[] current
cursor
protected int cursor
limit
protected int limit
limit_backward
protected int limit_backward
bra
protected int bra
ket
protected int ket
Constructor Details
SnowballProgram
protected SnowballProgram()
Method Details
stem
public abstract boolean stem()
setCurrent
public void setCurrent(String value)
Set the current string.
getCurrent
public String getCurrent()
Get the current string.
setCurrent
public void setCurrent(char[] text, int length)
Set the current string.
Parameters:
text - character array containing input
length - valid length of text.
getCurrentBuffer
public char[] getCurrentBuffer()
Get the current buffer containing the stem.
NOTE: this may be a reference to a different character array than the one originally provided with setCurrent, in the exceptional case that stemming produced a longer intermediate or result string.
It is necessary to use getCurrentBufferLength() to determine the valid length of the returned buffer. For example, many words are stemmed simply by subtracting from the length to remove suffixes.
See Also:
getCurrentBufferLength()
getCurrentBufferLength
public int getCurrentBufferLength()
Get the valid length of the character array in getCurrentBuffer().
Returns:
valid length of the array.
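The buffer contract described for getCurrentBuffer() and getCurrentBufferLength() can be sketched as follows. To keep the example runnable without extra libraries, it uses SketchProgram, a hypothetical stand-in that only mirrors the three public buffer methods and stem(); it is not the real SnowballProgram. TrailingSStemmer, truncate, and BufferDemo are likewise invented for illustration, and the toy rule (drop one trailing "s") is an assumption, not a Snowball algorithm.

```java
// Stand-in for illustration only: mirrors the public buffer methods
// documented here, but is NOT org.tartarus.snowball.SnowballProgram.
abstract class SketchProgram {
    private char[] current = new char[0];
    private int length;

    public void setCurrent(char[] text, int length) {
        this.current = text;
        this.length = length;
    }

    public char[] getCurrentBuffer() {
        return current;
    }

    public int getCurrentBufferLength() {
        return length;
    }

    // Hypothetical helper: shrink the valid length without touching the array.
    protected void truncate(int newLength) {
        this.length = newLength;
    }

    public abstract boolean stem();
}

// Hypothetical toy stemmer: removes one trailing 's' by subtracting
// from the valid length, leaving the array contents unchanged.
class TrailingSStemmer extends SketchProgram {
    @Override
    public boolean stem() {
        char[] buf = getCurrentBuffer();
        int len = getCurrentBufferLength();
        if (len > 1 && buf[len - 1] == 's') {
            truncate(len - 1);
        }
        return true;
    }
}

class BufferDemo {
    static String stemToString(SketchProgram program, String word) {
        char[] input = word.toCharArray();
        program.setCurrent(input, input.length);
        program.stem();
        // Re-read the buffer after stemming: it may be a different array
        // than the one passed to setCurrent, and only the first
        // getCurrentBufferLength() characters are valid.
        char[] out = program.getCurrentBuffer();
        int validLength = program.getCurrentBufferLength();
        return new String(out, 0, validLength);
    }

    public static void main(String[] args) {
        System.out.println(stemToString(new TrailingSStemmer(), "cats")); // prints "cat"
    }
}
```

Building the result from the array's full length instead of the valid length would return "cats" here, because the removed suffix is still physically present in the array.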
copy_from
protected void copy_from(SnowballProgram other)
in_grouping
protected boolean in_grouping(char[] s, int min, int max)
in_grouping_b
protected boolean in_grouping_b(char[] s, int min, int max)
out_grouping
protected boolean out_grouping(char[] s, int min, int max)
out_grouping_b
protected boolean out_grouping_b(char[] s, int min, int max)
in_range
protected boolean in_range(int min, int max)
in_range_b
protected boolean in_range_b(int min, int max)
out_range
protected boolean out_range(int min, int max)
out_range_b
protected boolean out_range_b(int min, int max)
eq_s
protected boolean eq_s(int s_size, CharSequence s)
eq_s_b
protected boolean eq_s_b(int s_size, CharSequence s)
eq_v
protected boolean eq_v(CharSequence s)
eq_v_b
find_among
protected int find_among(Among[] v, int v_size)
find_among_b
protected int find_among_b(Among[] v, int v_size)
replace_s
protected int replace_s(int c_bra, int c_ket, CharSequence s)
slice_check
protected void slice_check()
slice_from
slice_del
protected void slice_del()
insert
protected void insert(int c_bra, int c_ket, CharSequence s)
slice_to
assign_to