Class TokenInfoDictionaryBuilder
java.lang.Object
org.apache.lucene.analysis.ja.util.TokenInfoDictionaryBuilder
-
Field Summary
Fields
Modifier and Type    Field    Description
private final String    encoding
private final DictionaryBuilder.DictionaryFormat    format
private final Normalizer.Form    normalForm
private int    offset    Internal word id - incrementally assigned as entries are read and added.
-
Constructor Summary
Constructors
Constructor    Description
TokenInfoDictionaryBuilder(DictionaryBuilder.DictionaryFormat format, String encoding, boolean normalizeEntries)
-
Method Summary
Modifier and Type    Method    Description
private TokenInfoDictionaryWriter    buildDictionary(List<Path> csvFiles)
private String[]    formatEntry(String[] features)
-
Field Details
-
encoding
-
normalForm
-
format
-
offset
private int offset
Internal word id, incrementally assigned as entries are read and added. This will be the byte offset in the dictionary file.
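The idea of a word id that doubles as a byte offset can be illustrated with a minimal sketch. This is not the builder's actual code: the class WordIdOffsetSketch, its put method, and the UTF-8 encoding of entries are all hypothetical, chosen only to show how an id can be taken from the current write position and then advanced by the size of each entry.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of org.apache.lucene.analysis.ja.util.
class WordIdOffsetSketch {
    // Stands in for the dictionary file being written.
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    // Next word id to hand out; equals the number of bytes written so far.
    private int offset = 0;

    /** Appends an entry and returns the id assigned to it (its starting byte offset). */
    public int put(String entry) {
        byte[] bytes = entry.getBytes(StandardCharsets.UTF_8); // assumed encoding
        int wordId = offset;                  // id is the position before writing
        buffer.write(bytes, 0, bytes.length); // append the entry
        offset += bytes.length;               // advance to the next free position
        return wordId;
    }

    /** Returns the offset that the next entry would receive. */
    public int nextOffset() {
        return offset;
    }

    public static void main(String[] args) {
        WordIdOffsetSketch sketch = new WordIdOffsetSketch();
        System.out.println(sketch.put("abc")); // prints 0
        System.out.println(sketch.put("de"));  // prints 3
        System.out.println(sketch.nextOffset()); // prints 5
    }
}
```

Under this assumption, ids are not consecutive integers: each id is larger than the previous one by the encoded length of the previous entry.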
-
-
Constructor Details
-
TokenInfoDictionaryBuilder
public TokenInfoDictionaryBuilder(DictionaryBuilder.DictionaryFormat format, String encoding, boolean normalizeEntries)
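The constructor takes a boolean normalizeEntries, while the class stores a Normalizer.Form in normalForm. One plausible way to connect the two is sketched below. The choice of NFKC, the use of null to mean "no normalization", and the helper names chooseForm and normalizeEntry are assumptions made for illustration; only java.text.Normalizer itself is a real API here.

```java
import java.text.Normalizer;

// Hypothetical illustration; not the builder's actual implementation.
class NormalizeEntrySketch {
    /** Maps the boolean flag to a form; NFKC is an assumed choice. */
    static Normalizer.Form chooseForm(boolean normalizeEntries) {
        return normalizeEntries ? Normalizer.Form.NFKC : null;
    }

    /** Normalizes one dictionary line if a form is set and the line is not already normalized. */
    static String normalizeEntry(String line, Normalizer.Form form) {
        if (form == null || Normalizer.isNormalized(line, form)) {
            return line; // nothing to do
        }
        return Normalizer.normalize(line, form);
    }

    public static void main(String[] args) {
        String fullWidth = "\uFF21\uFF22\uFF23"; // full-width "ABC"
        // With normalization enabled, full-width letters fold to ASCII.
        System.out.println(normalizeEntry(fullWidth, chooseForm(true))); // prints ABC
        // With normalization disabled, the line is returned unchanged.
        System.out.println(normalizeEntry(fullWidth, chooseForm(false)).equals(fullWidth)); // prints true
    }
}
```

Checking isNormalized first avoids allocating a new string for lines that are already in the target form, which matters when reading large CSV dictionaries.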
-
-
Method Details
-
build
- Throws:
IOException
-
buildDictionary
- Throws:
IOException
-
formatEntry
-