Package org.apache.lucene.analysis.ja
Class JapaneseAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.StopwordAnalyzerBase
org.apache.lucene.analysis.ja.JapaneseAnalyzer
- All Implemented Interfaces:
Closeable, AutoCloseable
Analyzer for Japanese that uses morphological analysis.
- Since:
- 3.6.0
Nested Class Summary
Nested Classes
private static class
    Atomically loads DEFAULT_STOP_SET and DEFAULT_STOP_TAGS lazily, the first time the outer class accesses the static final set.
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
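The lazy loading described for the private static nested class reads like the initialization-on-demand holder idiom, in which the JVM initializes a nested class only on first access and guarantees that this happens exactly once, even under concurrent access. The following is a minimal sketch of that idiom, not the analyzer's actual source: the names LazyStopSetDemo, DefaultSetHolder and LOADS are hypothetical, the stop words are placeholders, and a plain java.util.Set stands in for Lucene's CharArraySet.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyStopSetDemo {
    // Counts holder initializations; exists only to make the laziness observable.
    public static final AtomicInteger LOADS = new AtomicInteger();

    // Hypothetical stand-in for the private holder class. The JVM runs this
    // static initializer on first access to DEFAULT_STOP_SET, and only once.
    private static class DefaultSetHolder {
        static final Set<String> DEFAULT_STOP_SET;

        static {
            LOADS.incrementAndGet();
            Set<String> words = new HashSet<>();
            // Placeholder entries; a real analyzer would load these from a resource.
            words.add("no");
            words.add("ni");
            words.add("wa");
            DEFAULT_STOP_SET = Collections.unmodifiableSet(words);
        }
    }

    // Accessor in the style of a static getDefaultStopSet() method.
    public static Set<String> getDefaultStopSet() {
        return DefaultSetHolder.DEFAULT_STOP_SET;
    }

    public static void main(String[] args) {
        System.out.println(LOADS.get()); // prints 0: holder not yet initialized
        getDefaultStopSet();
        getDefaultStopSet();
        System.out.println(LOADS.get()); // prints 1: initialized exactly once
    }
}
```

Because class initialization is synchronized by the JVM, this pattern needs no explicit locking, and the default sets cost nothing for callers that always supply their own stop words.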
Field Summary
Fields
private final JapaneseTokenizer.Mode mode
private final UserDictionary userDict
Fields inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase
stopwords
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
Constructor Summary
Constructors
JapaneseAnalyzer(UserDictionary userDict, JapaneseTokenizer.Mode mode, CharArraySet stopwords, Set<String> stoptags)
Method Summary
Methods
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
    Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
static CharArraySet getDefaultStopSet()
protected Reader initReader(String fieldName, Reader reader)
    Override this if you want to add a CharFilter chain.
protected Reader initReaderForNormalization(String fieldName, Reader reader)
    Wrap the given Reader with CharFilters that make sense for normalization.
protected TokenStream normalize(String fieldName, TokenStream in)
    Wrap the given TokenStream in order to apply normalization filters.
Methods inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase
getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, normalize, tokenStream, tokenStream
Field Details
mode
stoptags
userDict
Constructor Details
JapaneseAnalyzer
public JapaneseAnalyzer()
JapaneseAnalyzer
public JapaneseAnalyzer(UserDictionary userDict, JapaneseTokenizer.Mode mode, CharArraySet stopwords, Set<String> stoptags)
Method Details
getDefaultStopSet
getDefaultStopTags
createComponents
Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
- Specified by:
createComponents in class Analyzer
- Parameters:
fieldName - the name of the field whose content is passed to the Analyzer.TokenStreamComponents sink as a reader
- Returns:
the Analyzer.TokenStreamComponents for this analyzer.
normalize
Description copied from class: Analyzer
Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
initReader
Description copied from class: Analyzer
Override this if you want to add a CharFilter chain.
The default implementation returns reader unchanged.
- Overrides:
initReader in class Analyzer
- Parameters:
fieldName - IndexableField name being indexed
reader - original Reader
- Returns:
reader, optionally decorated with CharFilter(s)
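The "reader, optionally decorated" contract of initReader is the decorator pattern applied to java.io.Reader. The sketch below illustrates that shape using only the JDK. It is an analogy under stated assumptions, not Lucene code: WidthFoldingReaderDemo, WidthFoldingReader and foldAll are hypothetical names, the width folding is just an example transformation, and a real Lucene CharFilter additionally corrects character offsets, which this sketch does not attempt.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class WidthFoldingReaderDemo {

    // Hypothetical decorator: maps full-width ASCII variants (U+FF01..U+FF5E)
    // to their ASCII counterparts and passes every other character through.
    static class WidthFoldingReader extends FilterReader {
        WidthFoldingReader(Reader in) {
            super(in);
        }

        private static int fold(int c) {
            return (c >= 0xFF01 && c <= 0xFF5E) ? c - 0xFEE0 : c;
        }

        @Override
        public int read() throws IOException {
            int c = super.read();
            return c == -1 ? -1 : fold(c);
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            // When n is -1 (end of stream) the loop body never runs.
            for (int i = off; i < off + n; i++) {
                buf[i] = (char) fold(buf[i]);
            }
            return n;
        }
    }

    // Same shape as initReader(String, Reader): return the reader, decorated.
    // fieldName is unused here; an override could pick filters per field.
    public static Reader initReader(String fieldName, Reader reader) {
        return new WidthFoldingReader(reader);
    }

    // Convenience helper: runs text through the decorated reader.
    public static String foldAll(String text) {
        try (Reader r = initReader("body", new StringReader(text))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[64];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Full-width "ABC" followed by ASCII " 123".
        System.out.println(foldAll("\uFF21\uFF22\uFF23 123")); // prints "ABC 123"
    }
}
```

Normalizing at the reader level means the tokenizer only ever sees the folded text, which is why character-level normalization of this kind sits in front of the tokenizer rather than in a token filter.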
initReaderForNormalization
Description copied from class: Analyzer
Wrap the given Reader with CharFilters that make sense for normalization. This is typically a subset of the CharFilters that are applied in Analyzer.initReader(String, Reader). This is used by Analyzer.normalize(String, String).
- Overrides:
initReaderForNormalization in class Analyzer