Package org.apache.lucene.analysis.ko
Class KoreanAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.ko.KoreanAnalyzer
- All Implemented Interfaces:
Closeable, AutoCloseable
Analyzer for Korean that uses morphological analysis.
- Since:
- 7.4.0
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents -
Field Summary
Fields
Modifier and Type, Field:
private final KoreanTokenizer.DecompoundMode mode
private final boolean outputUnknownUnigrams
private final Set<POS.Tag> stopTags
private final UserDictionary userDict
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY -
Constructor Summary
Constructors
KoreanAnalyzer()
Creates a new KoreanAnalyzer.
KoreanAnalyzer(UserDictionary userDict, KoreanTokenizer.DecompoundMode mode, Set<POS.Tag> stopTags, boolean outputUnknownUnigrams)
Creates a new KoreanAnalyzer.
-
Method Summary
Modifier and Type, Method, Description:
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
protected TokenStream normalize(String fieldName, TokenStream in)
Wraps the given TokenStream in order to apply normalization filters.
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, getVersion, initReader, initReaderForNormalization, normalize, setVersion, tokenStream, tokenStream
-
Field Details
-
userDict
-
mode
-
stopTags
-
outputUnknownUnigrams
private final boolean outputUnknownUnigrams
-
-
Constructor Details
-
KoreanAnalyzer
public KoreanAnalyzer()
Creates a new KoreanAnalyzer.
-
KoreanAnalyzer
public KoreanAnalyzer(UserDictionary userDict, KoreanTokenizer.DecompoundMode mode, Set<POS.Tag> stopTags, boolean outputUnknownUnigrams)
Creates a new KoreanAnalyzer.
- Parameters:
userDict - Optional: the user dictionary to use, if non-null.
mode - Decompound mode.
stopTags - The set of part-of-speech tags that should be filtered out.
outputUnknownUnigrams - If true, outputs unigrams for unknown words.
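The stopTags parameter is easiest to picture with a small model. The sketch below is a toy illustration in plain Java, not Lucene code: the Tag enum, the Token class, and the filter method are hypothetical stand-ins (Lucene's own POS.Tag values and token attributes are not used here), and the romanized sample words are made up for the example. It only shows the idea stated above: tokens whose part-of-speech tag is in the stop set are dropped.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Toy model only: Tag and Token are hypothetical stand-ins,
// not Lucene's POS.Tag or its token attributes.
class StopTagFilterSketch {
    enum Tag { NOUN, VERB, PARTICLE, ENDING }

    static final class Token {
        final String surface;
        final Tag tag;

        Token(String surface, Tag tag) {
            this.surface = surface;
            this.tag = tag;
        }
    }

    // Returns the surface forms of all tokens whose tag is NOT in stopTags.
    static List<String> filter(List<Token> tokens, Set<Tag> stopTags) {
        List<String> kept = new ArrayList<>();
        for (Token t : tokens) {
            if (!stopTags.contains(t.tag)) {
                kept.add(t.surface);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Romanized placeholder morphemes, tagged by hand for illustration.
        List<Token> tokens = List.of(
            new Token("hakgyo", Tag.NOUN),
            new Token("e", Tag.PARTICLE),
            new Token("ga", Tag.VERB),
            new Token("da", Tag.ENDING));
        // Prints [hakgyo, ga]
        System.out.println(filter(tokens, EnumSet.of(Tag.PARTICLE, Tag.ENDING)));
    }
}
```

Passing an empty set to this toy filter keeps every token, which is the natural way to read an empty stopTags set as well; check the analyzer's actual behavior before relying on that.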
-
-
Method Details
-
createComponents
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
- Specified by:
createComponents in class Analyzer
- Parameters:
fieldName - the name of the field's content passed to the Analyzer.TokenStreamComponents sink as a reader
- Returns:
the Analyzer.TokenStreamComponents for this analyzer.
-
normalize
protected TokenStream normalize(String fieldName, TokenStream in)
Description copied from class: Analyzer
Wraps the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
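The relationship between the protected hook and the public normalize(String, String) entry point can be sketched as a small decorator pattern. This is a plain-Java toy under stated assumptions, not the Lucene implementation: BaseAnalyzer, LowerCasingAnalyzer, and the use of Iterator<String> in place of TokenStream are all hypothetical, and lowercasing is chosen only as an example of a normalization step.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Toy model only: Iterator<String> stands in for a token stream, and both
// analyzer classes are hypothetical, not Lucene types.
class NormalizeHookSketch {
    static class BaseAnalyzer {
        // Hook: the default returns the stream as-is.
        protected Iterator<String> normalize(String fieldName, Iterator<String> in) {
            return in;
        }

        // Entry point that routes text through the hook, loosely mirroring
        // the role the text above gives to Analyzer.normalize(String, String).
        final String normalize(String fieldName, String text) {
            Iterator<String> out = normalize(fieldName, List.of(text).iterator());
            StringBuilder sb = new StringBuilder();
            while (out.hasNext()) {
                sb.append(out.next());
            }
            return sb.toString();
        }
    }

    static class LowerCasingAnalyzer extends BaseAnalyzer {
        // Override: wrap the incoming stream so each term is lowercased lazily.
        @Override
        protected Iterator<String> normalize(String fieldName, Iterator<String> in) {
            return new Iterator<String>() {
                @Override
                public boolean hasNext() {
                    return in.hasNext();
                }

                @Override
                public String next() {
                    return in.next().toLowerCase(Locale.ROOT);
                }
            };
        }
    }

    public static void main(String[] args) {
        System.out.println(new BaseAnalyzer().normalize("body", "Lucene"));        // prints Lucene
        System.out.println(new LowerCasingAnalyzer().normalize("body", "Lucene")); // prints lucene
    }
}
```

The point of the design is that subclasses change only the wrapping step, while callers keep using the same string-in, string-out entry point.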
-