java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.StopwordAnalyzerBase
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
ArabicAnalyzer, ArmenianAnalyzer, BasqueAnalyzer, BengaliAnalyzer, BrazilianAnalyzer, BulgarianAnalyzer, CatalanAnalyzer, CJKAnalyzer, ClassicAnalyzer, CzechAnalyzer, DanishAnalyzer, EnglishAnalyzer, EstonianAnalyzer, FinnishAnalyzer, FrenchAnalyzer, GalicianAnalyzer, GermanAnalyzer, GreekAnalyzer, HindiAnalyzer, HungarianAnalyzer, IndonesianAnalyzer, IrishAnalyzer, ItalianAnalyzer, JapaneseAnalyzer, LatvianAnalyzer, LithuanianAnalyzer, NepaliAnalyzer, NorwegianAnalyzer, PersianAnalyzer, PolishAnalyzer, PortugueseAnalyzer, RomanianAnalyzer, RussianAnalyzer, SerbianAnalyzer, SoraniAnalyzer, SpanishAnalyzer, StandardAnalyzer, StopAnalyzer, SwedishAnalyzer, TamilAnalyzer, TeluguAnalyzer, ThaiAnalyzer, TurkishAnalyzer, UAX29URLEmailAnalyzer
Base class for Analyzers that need to make use of stopword sets.
- Since:
- 3.1
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
Field Summary
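Fields
Modifier and Type | Field | Description
protected final CharArraySet | stopwords | An immutable stopword set
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY, storedValue
-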
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | StopwordAnalyzerBase() | Creates a new Analyzer with an empty stopword set
protected | StopwordAnalyzerBase(CharArraySet stopwords) | Creates a new instance initialized with the given stopword set
-
Method Summary
Modifier and Type | Method | Description
CharArraySet | getStopwordSet() | Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords
protected static CharArraySet | loadStopwordSet(boolean ignoreCase, Class<? extends Analyzer> aClass, String resource, String comment) | Deprecated, for removal: This API element is subject to removal in a future version.
protected static CharArraySet | loadStopwordSet(Reader stopwords) | Creates a CharArraySet from a reader.
protected static CharArraySet | loadStopwordSet(Path stopwords) | Creates a CharArraySet from a path.
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, createComponents, getOffsetGap, getPositionIncrementGap, getReuseStrategy, initReader, initReaderForNormalization, normalize, normalize, tokenStream, tokenStream
-
Field Details
-
stopwords
protected final CharArraySet stopwords
An immutable stopword set
-
-
Constructor Details
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase(CharArraySet stopwords)
Creates a new instance initialized with the given stopword set
- Parameters:
stopwords - the analyzer's stopword set
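A minimal sketch of how a subclass might use this constructor. The class name SimpleStopAnalyzer, the helper terms, the field name "body", and the choice of tokenizer and filters are assumptions made for illustration, not part of this class:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.StopwordAnalyzerBase;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical analyzer: tokenize, lowercase, then drop stopwords.
final class SimpleStopAnalyzer extends StopwordAnalyzerBase {

  SimpleStopAnalyzer(CharArraySet stopwords) {
    // The base class stores the set in its protected `stopwords` field.
    super(stopwords);
  }

  @Override
  protected TokenStreamComponents createComponents(String fieldName) {
    Tokenizer source = new StandardTokenizer();
    TokenStream result = new LowerCaseFilter(source);
    result = new StopFilter(result, stopwords);
    return new TokenStreamComponents(source, result);
  }

  // Illustration-only helper: collects the terms the analyzer emits for `text`.
  static List<String> terms(Analyzer analyzer, String text) throws IOException {
    List<String> out = new ArrayList<>();
    try (TokenStream ts = analyzer.tokenStream("body", text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        out.add(term.toString());
      }
      ts.end();
    }
    return out;
  }
}
```

Callers can then read the configured set back through getStopwordSet(), for example to log or compare the stopwords an analyzer instance was built with.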
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase()
Creates a new Analyzer with an empty stopword set
-
-
Method Details
-
getStopwordSet
public CharArraySet getStopwordSet()
Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords
- Returns:
- the analyzer's stopword set or an empty set if the analyzer has no stopwords
-
loadStopwordSet
@Deprecated(forRemoval=true, since="9.1")
protected static CharArraySet loadStopwordSet(boolean ignoreCase, Class<? extends Analyzer> aClass, String resource, String comment) throws IOException
Deprecated, for removal: This API element is subject to removal in a future version.
Class.getResourceAsStream(String) is caller sensitive and cannot load resources across Java Modules. Call getResourceAsStream() from your own code and pass the result to WordlistLoader.getWordSet(Reader, String, CharArraySet), or to one of the other WordlistLoader methods, directly.
Creates a CharArraySet from a file resource associated with a class. (See Class.getResourceAsStream(String)).
- Parameters:
ignoreCase - true if the set should ignore the case of the stopwords, otherwise false
aClass - a class that is associated with the given stopwordResource
resource - name of the resource file associated with the given class
comment - comment string to ignore in the stopword file
- Returns:
- a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException - if loading the stopwords throws an IOException
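The replacement suggested by the deprecation note could look roughly like the sketch below. The class StopwordResources and its load method are hypothetical names, and UTF-8 decoding is an assumption about the resource file. The stream is expected to come from the caller's own class (for example MyAnalyzer.class.getResourceAsStream("stopwords.txt"), both names being placeholders) so that the lookup happens inside the module that owns the resource:

```java
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.WordlistLoader;

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper standing in for the deprecated loadStopwordSet overload.
final class StopwordResources {
  private StopwordResources() {}

  // `in` is the stream the caller obtained from its own class;
  // getResourceAsStream returns null when the resource is missing.
  static CharArraySet load(InputStream in, boolean ignoreCase, String comment)
      throws IOException {
    if (in == null) {
      throw new IOException("stopword resource not found");
    }
    // Assumption: the stopword file is UTF-8 encoded.
    try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
      // Lines starting with `comment` are skipped; other lines are added to the set.
      return WordlistLoader.getWordSet(reader, comment, new CharArraySet(16, ignoreCase));
    }
  }
}
```

Keeping the getResourceAsStream() call in the analyzer's own class, rather than in a shared helper, is what avoids the cross-module limitation described above.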
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(Path stopwords) throws IOException
Creates a CharArraySet from a path.
- Parameters:
stopwords - the stopwords file to load
- Returns:
- a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException - if loading the stopwords throws an IOException
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(Reader stopwords) throws IOException
Creates a CharArraySet from a reader.
- Parameters:
stopwords - the stopwords reader to load
- Returns:
- a CharArraySet containing the distinct stopwords from the given reader
- Throws:
IOException - if loading the stopwords throws an IOException
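Because the Path and Reader overloads of loadStopwordSet are protected and static, they are typically called from a subclass constructor. The following is a sketch under that assumption. LoadedStopAnalyzer is a hypothetical class, and the tokenizer and filter chain are chosen only to keep the example short:

```java
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.StopwordAnalyzerBase;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;

import java.io.IOException;
import java.io.Reader;
import java.nio.file.Path;

// Hypothetical analyzer whose stopwords come from a file or a reader,
// one word per line.
final class LoadedStopAnalyzer extends StopwordAnalyzerBase {

  // Loads the stopwords from a file on disk.
  LoadedStopAnalyzer(Path stopwordFile) throws IOException {
    super(loadStopwordSet(stopwordFile));
  }

  // Loads the stopwords from any Reader, e.g. a StringReader in tests.
  LoadedStopAnalyzer(Reader stopwords) throws IOException {
    super(loadStopwordSet(stopwords));
  }

  @Override
  protected TokenStreamComponents createComponents(String fieldName) {
    Tokenizer source = new StandardTokenizer();
    TokenStream result = new StopFilter(source, stopwords);
    return new TokenStreamComponents(source, result);
  }
}
```

Either constructor leaves the loaded words available through getStopwordSet(), so the two sources can be used interchangeably by the rest of the analyzer.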
-