Package org.apache.lucene.analysis.core
Class StopFilterFactory
java.lang.Object
org.apache.lucene.analysis.util.AbstractAnalysisFactory
org.apache.lucene.analysis.util.TokenFilterFactory
org.apache.lucene.analysis.core.StopFilterFactory
- All Implemented Interfaces:
ResourceLoaderAware
Factory for StopFilter.
<fieldType name="text_stop" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" format="wordset" />
</analyzer>
</fieldType>
All attributes are optional:
- ignoreCase defaults to false
- words should be the name of a stopwords file to parse; if not specified, the factory will use EnglishAnalyzer.ENGLISH_STOP_WORDS_SET
- format defines how the words file will be parsed, and defaults to wordset. If words is not specified, then format must not be specified.
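To make the effect of ignoreCase concrete, here is a minimal, self-contained sketch of what a whitespace tokenizer followed by a stop filter does to a piece of text. It does not use Lucene's classes; the names StopFilterSketch and filter are hypothetical and exist only for this illustration, and the real StopFilter operates on a TokenStream rather than on a list of strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustration only: a plain-Java approximation, not Lucene's StopFilter. */
class StopFilterSketch {

  /**
   * Splits text on whitespace and drops every token found in stopWords.
   * When ignoreCase is true, both the stop words and the tokens are
   * compared in lower case; the surviving tokens keep their original case.
   */
  static List<String> filter(String text, Set<String> stopWords, boolean ignoreCase) {
    // Normalize the stop set once so lookups are cheap.
    Set<String> lookup = new HashSet<>();
    for (String word : stopWords) {
      lookup.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
    }

    List<String> kept = new ArrayList<>();
    for (String token : text.trim().split("\\s+")) {
      if (token.isEmpty()) {
        continue; // happens only when the input is blank
      }
      String key = ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
      if (!lookup.contains(key)) {
        kept.add(token);
      }
    }
    return kept;
  }
}
```

With the stop words "the" and "and", calling filter on "The quick fox and the dog" removes the capitalized "The" only when ignoreCase is true; with the default of false, the comparison is exact and "The" is kept.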
The valid values for the format option are:
- wordset - This is the default format, which supports one word per line (including any intra-word whitespace) and allows whole line comments beginning with the "#" character. Blank lines are ignored. See WordlistLoader.getLines for details.
- snowball - This format allows for multiple words specified on each line, and trailing comments may be specified using the vertical line ("|"). Blank lines are ignored. See WordlistLoader.getSnowballWordSet for details.
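The difference between the two formats is easiest to see side by side. The following is a rough sketch of the parsing rules as described above, written in plain Java; it is not the code in WordlistLoader, and the class and method names (StopWordFormats, parseWordset, parseSnowball) are made up for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustration only: approximates the two stop-word file formats. */
class StopWordFormats {

  /** wordset: one entry per line, whole-line "#" comments, blank lines ignored. */
  static Set<String> parseWordset(List<String> lines) {
    Set<String> words = new LinkedHashSet<>();
    for (String line : lines) {
      if (line.startsWith("#")) {
        continue; // whole-line comment
      }
      String entry = line.trim();
      if (entry.isEmpty()) {
        continue; // blank line
      }
      words.add(entry); // whitespace inside the entry is preserved
    }
    return words;
  }

  /** snowball: several words per line, "|" starts a trailing comment. */
  static Set<String> parseSnowball(List<String> lines) {
    Set<String> words = new LinkedHashSet<>();
    for (String line : lines) {
      int bar = line.indexOf('|');
      String content = (bar >= 0 ? line.substring(0, bar) : line).trim();
      if (content.isEmpty()) {
        continue; // blank line or comment-only line
      }
      words.addAll(Arrays.asList(content.split("\\s+")));
    }
    return words;
  }
}
```

Under these rules, the line "new york" yields a single entry in wordset format but two entries ("new" and "york") in snowball format, which is why the format attribute has to match the file being loaded.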
- Since:
- 3.1
-
Field Summary
Fields (Modifier and Type, Field, Description):
- private final String format
- static final String FORMAT_SNOWBALL
- static final String FORMAT_WORDSET
- private final boolean ignoreCase
- static final String NAME: SPI name
- private final String stopWordFiles
- private CharArraySet stopWords
Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors:
- StopFilterFactory: Creates a new StopFilterFactory
Method Summary
Methods (Modifier and Type, Method, Description):
- create(TokenStream input): Transform the specified input TokenStream
- getStopWords
- void inform(ResourceLoader loader): Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
- boolean isIgnoreCase()
Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
Field Details
-
NAME
SPI name
-
FORMAT_WORDSET
-
FORMAT_SNOWBALL
-
stopWords
-
stopWordFiles
-
format
-
ignoreCase
private final boolean ignoreCase
-
-
Constructor Details
-
StopFilterFactory
Creates a new StopFilterFactory
-
-
Method Details
-
inform
Description copied from interface: ResourceLoaderAware
Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
- Specified by:
inform in interface ResourceLoaderAware
- Throws:
IOException
-
isIgnoreCase
public boolean isIgnoreCase()
-
getStopWords
-
create
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
- Specified by:
create in class TokenFilterFactory
-