Package org.apache.lucene.analysis.ja
Analyzer for Japanese.
Class Description
Outputs the dot (graphviz) string for the Viterbi lattice.
Analyzer for Japanese that uses morphological analysis.
Atomically loads DEFAULT_STOP_SET, DEFAULT_STOP_TAGS in a lazy fashion once the outer class accesses the static final set the first time.
Replaces term text with the BaseFormAttribute.
Factory for JapaneseBaseFormFilter.
Normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
Factory for JapaneseIterationMarkCharFilter.
A TokenFilter that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
Factory for JapaneseKatakanaStemFilter.
A TokenFilter that normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters.
Buffer that holds a Japanese number string and a position index used as a parsed-to marker.
Factory for JapaneseNumberFilter.
Removes tokens that match a set of part-of-speech tags.
Factory for JapanesePartOfSpeechStopFilter.
A TokenFilter that replaces the term attribute with the reading of a token in either katakana or romaji form.
Factory for JapaneseReadingFormFilter.
Tokenizer for Japanese that uses morphological analysis.
Tokenization mode: this determines how the tokenizer handles compound and unknown words.
Token type reflecting the original source of this token.
Factory for JapaneseTokenizer.
Analyzed token with morphological data from its dictionary.
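
The katakana normalization described above (removing a trailing long sound character, U+30FC) can be illustrated with a small standalone sketch. This is not the Lucene implementation and does not use the Lucene API: the class name KatakanaStemSketch, the helper methods, and the minimum-length threshold of 4 are assumptions made for illustration only. It shows the idea on plain strings rather than on a token stream.

```java
import java.util.List;

// Hypothetical standalone sketch; not part of org.apache.lucene.analysis.ja.
final class KatakanaStemSketch {

    // U+30FC, the long sound character named in the package description.
    static final char PROLONGED_SOUND_MARK = '\u30FC';

    // Assumed threshold: shorter terms are left untouched so that short
    // words ending in the long sound character are not over-stemmed.
    static final int ASSUMED_MINIMUM_LENGTH = 4;

    // True if the term is non-empty and every character is in the Unicode Katakana block.
    static boolean isKatakana(String term) {
        if (term.isEmpty()) {
            return false;
        }
        for (int i = 0; i < term.length(); i++) {
            if (Character.UnicodeBlock.of(term.charAt(i)) != Character.UnicodeBlock.KATAKANA) {
                return false;
            }
        }
        return true;
    }

    // Removes one trailing U+30FC from sufficiently long all-katakana terms.
    static String stem(String term) {
        boolean longEnough = term.length() >= ASSUMED_MINIMUM_LENGTH;
        if (longEnough
                && term.charAt(term.length() - 1) == PROLONGED_SOUND_MARK
                && isKatakana(term)) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        // A long katakana term, a short katakana term, and a kanji term.
        for (String term : List.of("コンピューター", "コピー", "東京")) {
            System.out.println(term + " -> " + stem(term));
        }
    }
}
```

Under these assumptions, コンピューター becomes コンピュータ, while コピー (shorter than the assumed threshold) and 東京 (not katakana) are returned unchanged. In an actual analysis chain this kind of normalization would run as a TokenFilter over the tokenizer's output rather than on raw strings.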