Package org.apache.lucene.classification
Class KNearestFuzzyClassifier
java.lang.Object
org.apache.lucene.classification.KNearestFuzzyClassifier
All Implemented Interfaces:
    Classifier<BytesRef>

A k-Nearest Neighbor classifier based on NearestFuzzyQuery.

Field Summary
Fields
Modifier and Type             Field            Description
private final Analyzer        analyzer
private final String          classFieldName   the name of the field used as the output text
private final IndexSearcher   indexSearcher    an IndexSearcher used to perform queries
private final int             k                the number of docs to compare in order to find the nearest neighbor to the input text
private final Query           query            a Query used to filter the documents that should be used from this classifier's underlying LeafReader
private final String[]        textFieldNames   the names of the fields used as the input text

Constructor Summary
Constructors
Constructor
    KNearestFuzzyClassifier(IndexReader indexReader, Similarity similarity, Analyzer analyzer, Query query, int k, String classFieldName, String... textFieldNames)
Description
    Creates a KNearestFuzzyClassifier.

Method Summary
Modifier and Type                              Method                                  Description
ClassificationResult<BytesRef>                 assignClass(String text)                Assign a class (with score) to the given text String.
private List<ClassificationResult<BytesRef>>   buildListFromTopDocs(TopDocs topDocs)   Build a list of classification results from search results.
List<ClassificationResult<BytesRef>>           getClasses(String text)                 Get all the classes (sorted by score, descending) assigned to the given text String.
List<ClassificationResult<BytesRef>>           getClasses(String text, int max)        Get the first max classes (sorted by score, descending) assigned to the given text String.
private TopDocs                                knnSearch
String                                         toString()

Field Details
textFieldNames
private final String[] textFieldNames
the names of the fields used as the input text

classFieldName
private final String classFieldName
the name of the field used as the output text

indexSearcher
private final IndexSearcher indexSearcher
an IndexSearcher used to perform queries

k
private final int k
the number of docs to compare in order to find the nearest neighbor to the input text

query
private final Query query
a Query used to filter the documents that should be used from this classifier's underlying LeafReader

analyzer
private final Analyzer analyzer

Constructor Details
KNearestFuzzyClassifier
public KNearestFuzzyClassifier(IndexReader indexReader, Similarity similarity, Analyzer analyzer, Query query, int k, String classFieldName, String... textFieldNames)
Creates a KNearestFuzzyClassifier.
Parameters:
    indexReader - the reader on the index to be used for classification
    similarity - the Similarity to be used by the underlying IndexSearcher, or null (defaults to BM25Similarity)
    analyzer - an Analyzer used to analyze unseen text
    query - a Query used to optionally filter the docs used for training the classifier, or null if all the indexed docs should be used
    k - the number of docs to select in the MLT results to find the nearest neighbor
    classFieldName - the name of the field used as the output for the classifier
    textFieldNames - the names of the fields used as the inputs for the classifier; they can contain a boosting indication, e.g. title^10

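The textFieldNames parameter accepts entries with a boosting indication such as title^10. The following plain-Java sketch shows one way such an entry could be split into a field name and a boost factor. The class BoostedFieldName, its parse method, and the default boost of 1.0 for entries without a caret are assumptions made for illustration; this is not Lucene's own parsing code.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class BoostedFieldName {
    final String field;
    final float boost;

    private BoostedFieldName(String field, float boost) {
        this.field = field;
        this.boost = boost;
    }

    /**
     * Splits a spec such as "title^10" into the field name "title" and the boost 10.0f.
     * A spec without '^' is returned with an assumed default boost of 1.0f.
     */
    static BoostedFieldName parse(String spec) {
        int caret = spec.lastIndexOf('^');
        if (caret < 0) {
            return new BoostedFieldName(spec, 1.0f);
        }
        String field = spec.substring(0, caret);
        // Throws NumberFormatException if the text after '^' is not a number.
        float boost = Float.parseFloat(spec.substring(caret + 1));
        return new BoostedFieldName(field, boost);
    }

    public static void main(String[] args) {
        for (String spec : new String[] {"title^10", "body"}) {
            BoostedFieldName parsed = parse(spec);
            System.out.println(parsed.field + " -> " + parsed.boost);
        }
    }
}
```

A caller would pass the raw strings (for example "title^10", "body") to the constructor as textFieldNames; the sketch only illustrates how the notation itself can be read.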
Method Details
assignClass
public ClassificationResult<BytesRef> assignClass(String text) throws IOException
Description copied from interface: Classifier
Assign a class (with score) to the given text String.
Specified by:
    assignClass in interface Classifier<BytesRef>
Parameters:
    text - a String containing text to be classified
Returns:
    a ClassificationResult holding the assigned class of type T and its score
Throws:
    IOException - If there is a low-level I/O error.

getClasses
public List<ClassificationResult<BytesRef>> getClasses(String text) throws IOException
Description copied from interface: Classifier
Get all the classes (sorted by score, descending) assigned to the given text String.
Specified by:
    getClasses in interface Classifier<BytesRef>
Parameters:
    text - a String containing text to be classified
Returns:
    the whole list of ClassificationResult, the classes and scores. Returns null if the classifier can't make lists.
Throws:
    IOException - If there is a low-level I/O error.

getClasses
public List<ClassificationResult<BytesRef>> getClasses(String text, int max) throws IOException
Description copied from interface: Classifier
Get the first max classes (sorted by score, descending) assigned to the given text String.
Specified by:
    getClasses in interface Classifier<BytesRef>
Parameters:
    text - a String containing text to be classified
    max - the maximum number of elements in the returned list
Returns:
    the list of ClassificationResult (classes and scores), cut to at most max elements. Returns null if the classifier can't make lists.
Throws:
    IOException - If there is a low-level I/O error.

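The "sorted by score, descending" and "cut to max" behaviour described for getClasses(String text, int max) can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the Lucene implementation: class labels are modelled as String keys and scores as Double values, and the class TopClassesSketch and its method topClasses are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration in plain Java; no Lucene types are used.
final class TopClassesSketch {

    /** Returns the (class, score) entries sorted by score, descending, cut to at most max elements. */
    static List<Map.Entry<String, Double>> topClasses(Map<String, Double> scoresByClass, int max) {
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(scoresByClass.entrySet());
        // List.sort is stable, so classes with equal scores keep their original relative order.
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        int limit = Math.min(Math.max(max, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, limit));
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("sports", 0.2);
        scores.put("politics", 0.7);
        scores.put("tech", 0.1);
        for (Map.Entry<String, Double> entry : topClasses(scores, 2)) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

Clamping max to the list size means that asking for more classes than exist simply returns all of them; whether the real classifier behaves the same way for out-of-range values is not stated in this page.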
knnSearch
Throws:
    IOException

buildListFromTopDocs
private List<ClassificationResult<BytesRef>> buildListFromTopDocs(TopDocs topDocs) throws IOException
Build a list of classification results from search results.
Parameters:
    topDocs - the search results as a TopDocs object
Returns:
    a List of ClassificationResult, one for each existing class
Throws:
    IOException - if it's not possible to get the stored value of class field

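Turning the k nearest search hits into one result per class amounts to a vote among the neighbors. The sketch below shows one common way to do this in plain Java: each hit adds its search score to the total of its class, and the totals are then normalized to sum to 1. The weighting and normalization scheme, the class NeighborVoteSketch, and the method scoreClasses are assumptions for illustration; this page does not specify how buildListFromTopDocs computes its scores.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration in plain Java; not Lucene's implementation.
final class NeighborVoteSketch {

    /**
     * neighborClasses[i] is the class label stored on the i-th hit and
     * neighborScores[i] is that hit's search score. Returns one entry per
     * distinct class, with scores normalized so that they add up to 1.
     */
    static Map<String, Double> scoreClasses(String[] neighborClasses, double[] neighborScores) {
        if (neighborClasses.length != neighborScores.length) {
            throw new IllegalArgumentException("one score is required per neighbor");
        }
        Map<String, Double> totals = new LinkedHashMap<>();
        double sum = 0.0;
        for (int i = 0; i < neighborClasses.length; i++) {
            // Accumulate the score of every neighbor under its class label.
            totals.merge(neighborClasses[i], neighborScores[i], Double::sum);
            sum += neighborScores[i];
        }
        if (sum > 0.0) {
            final double total = sum;
            totals.replaceAll((label, score) -> score / total);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] classes = {"a", "b", "a"};
        double[] scores = {2.0, 1.0, 1.0};
        System.out.println(scoreClasses(classes, scores));
    }
}
```

With this scheme a class that appears among several close neighbors outranks a class backed by a single hit, which is the intuition behind using k neighbors rather than only the nearest one.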
toString