Class SnowballProgram

java.lang.Object
org.tartarus.snowball.SnowballProgram
Direct Known Subclasses:
ArabicStemmer, ArmenianStemmer, BasqueStemmer, CatalanStemmer, DanishStemmer, DutchStemmer, EnglishStemmer, EstonianStemmer, FinnishStemmer, FrenchStemmer, German2Stemmer, GermanStemmer, HungarianStemmer, IrishStemmer, ItalianStemmer, KpStemmer, LithuanianStemmer, LovinsStemmer, NorwegianStemmer, PorterStemmer, PortugueseStemmer, RomanianStemmer, RussianStemmer, SpanishStemmer, SwedishStemmer, TurkishStemmer

public abstract class SnowballProgram extends Object
This is revision 502 of the Snowball SVN trunk (the project is now hosted on GitHub), with the following modifications:
  • Made abstract, with an abstract stem method, to avoid expensive reflection in the filter class.
  • Replaced StringBuffer with StringBuilder.
  • Uses char[] as the buffer instead of StringBuffer/StringBuilder.
  • eq_s, eq_s_b, insert, and replace_s take a CharSequence, like eq_v and eq_v_b.
  • Uses MethodHandles and fixes a method visibility bug.
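The first modification can be illustrated with a minimal sketch. All class and method names below (ProgramSketch, PluralSketch, FilterSketch, setWord, getWord, apply) are hypothetical stand-ins, not part of this API, and the "drop a trailing s" rule is a toy, not a Snowball algorithm. The point is only that a caller holding the abstract type can invoke stem() as an ordinary virtual call.

```java
// Hypothetical stand-ins; none of these names are part of the documented API.
abstract class ProgramSketch {
    protected String word = "";

    void setWord(String value) { word = value; }
    String getWord() { return word; }

    // Subclasses supply the algorithm, mirroring the abstract stem() method.
    abstract boolean stem();
}

final class PluralSketch extends ProgramSketch {
    @Override
    boolean stem() {
        // Toy rule for illustration only: drop one trailing 's'.
        if (word.length() > 1 && word.endsWith("s")) {
            word = word.substring(0, word.length() - 1);
            return true;
        }
        return false;
    }
}

final class FilterSketch {
    // The caller depends only on the abstract type, so stem() is a plain
    // virtual call rather than a java.lang.reflect.Method lookup and invoke.
    static String apply(ProgramSketch program, String token) {
        program.setWord(token);
        program.stem();
        return program.getWord();
    }
}
```

Any concrete stemmer can be passed to FilterSketch.apply without the caller knowing its class.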
  • Field Details

    • current

      private char[] current
    • cursor

      protected int cursor
    • limit

      protected int limit
    • limit_backward

      protected int limit_backward
    • bra

      protected int bra
    • ket

      protected int ket
  • Constructor Details

    • SnowballProgram

      protected SnowballProgram()
  • Method Details

    • stem

      public abstract boolean stem()
    • setCurrent

      public void setCurrent(String value)
      Set the current string.
    • getCurrent

      public String getCurrent()
      Get the current string.
    • setCurrent

      public void setCurrent(char[] text, int length)
      Set the current string.
      Parameters:
      text - character array containing input
      length - valid length of text.
    • getCurrentBuffer

      public char[] getCurrentBuffer()
      Get the current buffer containing the stem.

      NOTE: this may be a reference to a different character array than the one originally provided with setCurrent, in the exceptional case that stemming produced a longer intermediate or result string.

      It is necessary to use getCurrentBufferLength() to determine the valid length of the returned buffer. For example, many words are stemmed simply by subtracting from the length to remove suffixes.

      See Also:
      getCurrentBufferLength()
    • getCurrentBufferLength

      public int getCurrentBufferLength()
      Get the valid length of the character array in getCurrentBuffer().
      Returns:
      valid length of the array.
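      The buffer and length contract described for getCurrentBuffer() and getCurrentBufferLength() can be sketched as follows. ToyStemmer is a hypothetical stand-in that copies only the three documented method names; its stem() rule (dropping one trailing 's' by shrinking the valid length) is an assumption made for illustration, not a real Snowball algorithm.

```java
// Hypothetical stand-in that mimics the documented accessor names only.
final class ToyStemmer {
    private char[] current = new char[0];
    private int limit;

    void setCurrent(char[] text, int length) {
        current = text;
        limit = length;
    }

    char[] getCurrentBuffer() { return current; }

    int getCurrentBufferLength() { return limit; }

    // Toy rule: remove a suffix by subtracting from the valid length.
    // The array itself is left untouched, so it still holds the old characters.
    boolean stem() {
        if (limit > 1 && current[limit - 1] == 's') {
            limit--;
            return true;
        }
        return false;
    }

    static String stemWord(String word) {
        ToyStemmer stemmer = new ToyStemmer();
        char[] input = word.toCharArray();
        stemmer.setCurrent(input, input.length);
        stemmer.stem();
        // Re-read both accessors after stemming: the array may be longer than
        // the stem, and a real stemmer may have swapped in a different array.
        char[] buffer = stemmer.getCurrentBuffer();
        int length = stemmer.getCurrentBufferLength();
        return new String(buffer, 0, length);
    }
}
```

      Building the result with new String(buffer) instead would include the stale characters past the valid length.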
    • copy_from

      protected void copy_from(SnowballProgram other)
    • in_grouping

      protected boolean in_grouping(char[] s, int min, int max)
    • in_grouping_b

      protected boolean in_grouping_b(char[] s, int min, int max)
    • out_grouping

      protected boolean out_grouping(char[] s, int min, int max)
    • out_grouping_b

      protected boolean out_grouping_b(char[] s, int min, int max)
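      The grouping methods are listed here without a description. The sketch below assumes the bitmap convention of the upstream Snowball runtime, where the char[] argument packs eight membership bits per element and bit (ch - min) is set when ch belongs to the grouping. GroupingSketch and inGrouping are hypothetical names, and unlike the real methods this helper takes the character as an argument and does not move a cursor.

```java
// Hypothetical helper; assumes the upstream Snowball bitmap layout.
final class GroupingSketch {
    // The letters a, e, i, o, u, y encoded relative to min = 'a' (97):
    // a -> bit 0, e -> bit 4, i -> bit 8, o -> bit 14, u -> bit 20, y -> bit 24.
    static final char[] VOWELS = {17, 65, 16, 1};

    static boolean inGrouping(char[] bitmap, int min, int max, char ch) {
        // Characters outside [min, max] are never members.
        if (ch < min || ch > max) {
            return false;
        }
        int offset = ch - min;
        // Element index is offset / 8; bit index within it is offset % 8.
        return (bitmap[offset >> 3] & (1 << (offset & 7))) != 0;
    }
}
```

      Under this reading, an out_grouping test would be the negation of the same membership check for a character within bounds.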
    • in_range

      protected boolean in_range(int min, int max)
    • in_range_b

      protected boolean in_range_b(int min, int max)
    • out_range

      protected boolean out_range(int min, int max)
    • out_range_b

      protected boolean out_range_b(int min, int max)
    • eq_s

      protected boolean eq_s(int s_size, CharSequence s)
    • eq_s_b

      protected boolean eq_s_b(int s_size, CharSequence s)
    • eq_v

      protected boolean eq_v(CharSequence s)
    • eq_v_b

      protected boolean eq_v_b(CharSequence s)
    • find_among

      protected int find_among(Among[] v, int v_size)
    • find_among_b

      protected int find_among_b(Among[] v, int v_size)
    • replace_s

      protected int replace_s(int c_bra, int c_ket, CharSequence s)
    • slice_check

      protected void slice_check()
    • slice_from

      protected void slice_from(CharSequence s)
    • slice_del

      protected void slice_del()
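      The slice methods operate on the region marked by the bra and ket fields. The sketch below assumes that region is the half-open range [bra, ket) of the current buffer, that slice_from replaces it with the given text, and that slice_del removes it. SliceSketch, sliceFrom, sliceDel, and text are hypothetical names; the real methods may also adjust cursor and perform checks that are omitted here.

```java
// Hypothetical toy model of the slice operations; not the actual implementation.
final class SliceSketch {
    private char[] current;
    private int limit;
    int bra;
    int ket;

    SliceSketch(String word) {
        current = word.toCharArray();
        limit = current.length;
    }

    // Replace the characters in [bra, ket) with s and adjust the valid length.
    void sliceFrom(CharSequence s) {
        int newLimit = limit + s.length() - (ket - bra);
        char[] next = new char[Math.max(newLimit, current.length)];
        // Copy the prefix before the slice.
        System.arraycopy(current, 0, next, 0, bra);
        // Write the replacement text.
        for (int i = 0; i < s.length(); i++) {
            next[bra + i] = s.charAt(i);
        }
        // Copy whatever followed the slice.
        System.arraycopy(current, ket, next, bra + s.length(), limit - ket);
        current = next;
        limit = newLimit;
    }

    // Deleting the slice is replacement with the empty string.
    void sliceDel() {
        sliceFrom("");
    }

    String text() {
        return new String(current, 0, limit);
    }
}
```

      For example, marking the suffix "ational" of "relational" (bra = 3, ket = 10) and calling sliceFrom("ate") leaves "relate" in this toy model.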
    • insert

      protected void insert(int c_bra, int c_ket, CharSequence s)
    • slice_to

      protected StringBuilder slice_to(StringBuilder s)
    • assign_to

      protected StringBuilder assign_to(StringBuilder s)