Class IntersectionSimilarity<T>
java.lang.Object
org.apache.commons.text.similarity.IntersectionSimilarity<T>
- Type Parameters:
T - the type of the elements extracted from the character sequence
- All Implemented Interfaces:
SimilarityScore<IntersectionResult>
public class IntersectionSimilarity<T>
extends Object
implements SimilarityScore<IntersectionResult>
Measures the intersection of two sets created from a pair of character sequences.
It is assumed that the type T correctly conforms to the requirements for storage
within a Set or HashMap. Ideally the type is immutable and implements
Object.equals(Object) and Object.hashCode().
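As an illustration of the requirement above, here is a minimal sketch of an element type that could safely be stored in a Set or HashMap. The Bigram class is hypothetical and is not part of the library; it only shows an immutable type with consistent equals and hashCode implementations.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical element type (not part of the library): an immutable pair of characters.
final class Bigram {
    private final char first;
    private final char second;

    Bigram(char first, char second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Bigram)) {
            return false;
        }
        Bigram other = (Bigram) obj;
        return first == other.first && second == other.second;
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal bigrams produce equal hash codes.
        return Objects.hash(first, second);
    }

    public static void main(String[] args) {
        Set<Bigram> set = new HashSet<>();
        set.add(new Bigram('a', 'b'));
        set.add(new Bigram('a', 'b')); // equal to the first entry, so not stored twice
        System.out.println(set.size()); // prints 1
    }
}
```

Because the fields are final and equality depends only on them, an instance cannot change its hash code after being added to a set.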
- Since:
- 1.7
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
private static final class | | Mutable counter class for storing the count of elements.
private class | TinyBag | A minimal implementation of a Bag that can store elements and a count.
Field Summary
Fields
Modifier and Type | Field | Description
private final Function<CharSequence, Collection<T>> | converter | The converter used to create the elements from the characters.
Constructor Summary
Constructors
Constructor | Description
IntersectionSimilarity(Function<CharSequence, Collection<T>> converter) | Create a new intersection similarity using the provided converter.
Method Summary
Modifier and Type | Method | Description
IntersectionResult | apply(CharSequence left, CharSequence right) | Calculates the intersection of two character sequences passed as input.
private static <T> int | getIntersection(Set<T> setA, Set<T> setB) | Computes the intersection between two sets.
private int | getIntersection(IntersectionSimilarity<T>.TinyBag bagA, IntersectionSimilarity<T>.TinyBag bagB) | Computes the intersection between two bags.
private IntersectionSimilarity<T>.TinyBag | toBag(Collection<T> objects) | Converts the collection to a bag.
-
Field Details
-
converter
The converter used to create the elements from the characters.
-
-
Constructor Details
-
IntersectionSimilarity
Create a new intersection similarity using the provided converter.
If the converter returns a Set then the intersection result will not include duplicates. Any other Collection is used to produce a result that will include duplicates in the intersection and union.
- Parameters:
converter - the converter used to create the elements from the characters
- Throws:
IllegalArgumentException - if the converter is null
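The difference between a Set-returning converter and any other Collection can be sketched as follows. The class ConverterSketch and the names TO_SET and TO_LIST are assumptions made for illustration; only the Function<CharSequence, Collection<T>> shape comes from the constructor signature.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical converters with the shape the constructor expects.
class ConverterSketch {
    // Returns a Set, so repeated characters collapse into one element.
    static final Function<CharSequence, Collection<Character>> TO_SET = cs -> {
        Set<Character> set = new HashSet<>();
        for (int i = 0; i < cs.length(); i++) {
            set.add(cs.charAt(i));
        }
        return set;
    };

    // Returns a List, so repeated characters are kept as duplicates.
    static final Function<CharSequence, Collection<Character>> TO_LIST = cs -> {
        List<Character> list = new ArrayList<>();
        for (int i = 0; i < cs.length(); i++) {
            list.add(cs.charAt(i));
        }
        return list;
    };

    public static void main(String[] args) {
        System.out.println(TO_SET.apply("aabbb").size());  // prints 2
        System.out.println(TO_LIST.apply("aabbb").size()); // prints 5
    }
}
```

Either function could presumably be passed to the constructor; which one is appropriate depends on whether repeated elements should count more than once.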
-
-
Method Details
-
getIntersection
Computes the intersection between two sets. This is the count of all the elements that are within both sets.
- Type Parameters:
T - the type of the elements in the set
- Parameters:
setA - the set A
setB - the set B
- Returns:
- The intersection
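A minimal sketch of counting the elements common to two sets is shown below. It is not the library's private implementation; the class name SetIntersectionSketch and the choice to iterate the smaller set are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not the library's private method.
class SetIntersectionSketch {
    static <T> int intersection(Set<T> setA, Set<T> setB) {
        // Iterate the smaller set and probe the larger one to limit lookups.
        Set<T> small = setA.size() <= setB.size() ? setA : setB;
        Set<T> large = small == setA ? setB : setA;
        int count = 0;
        for (T element : small) {
            if (large.contains(element)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> b = new HashSet<>(Arrays.asList("b", "c", "d"));
        System.out.println(intersection(a, b)); // prints 2
    }
}
```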
-
apply
Calculates the intersection of two character sequences passed as input.
- Specified by:
apply in interface SimilarityScore<IntersectionResult>
- Parameters:
left - first character sequence
right - second character sequence
- Returns:
- The intersection result
- Throws:
IllegalArgumentException - if either input sequence is null
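The overall flow described on this page (reject null input, convert both sequences, then count shared elements with or without duplicates) can be sketched as below. This is an illustration under assumptions, not the library's code: the class ApplySketch, the helper toList and the int[] result layout {sizeA, sizeB, intersection} are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of the documented behaviour; names are illustrative only.
class ApplySketch {
    // Result layout is an assumption: {sizeA, sizeB, intersection}.
    static <T> int[] apply(Function<CharSequence, Collection<T>> converter,
                           CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Input cannot be null");
        }
        Collection<T> a = converter.apply(left);
        Collection<T> b = converter.apply(right);
        int intersection = 0;
        if (a instanceof Set && b instanceof Set) {
            // Sets: each shared element counts once.
            Set<T> shared = new HashSet<>(a);
            shared.retainAll(b);
            intersection = shared.size();
        } else {
            // Other collections: duplicates count, up to the smaller multiplicity.
            Map<T, Integer> counts = new HashMap<>();
            for (T element : a) {
                counts.merge(element, 1, Integer::sum);
            }
            for (T element : b) {
                Integer remaining = counts.get(element);
                if (remaining != null && remaining > 0) {
                    counts.put(element, remaining - 1);
                    intersection++;
                }
            }
        }
        return new int[] {a.size(), b.size(), intersection};
    }

    // Hypothetical converter that keeps duplicate characters.
    static Collection<Character> toList(CharSequence cs) {
        List<Character> list = new ArrayList<>();
        for (int i = 0; i < cs.length(); i++) {
            list.add(cs.charAt(i));
        }
        return list;
    }

    public static void main(String[] args) {
        Function<CharSequence, Collection<Character>> converter = ApplySketch::toList;
        int[] result = apply(converter, "aab", "abb");
        System.out.println(result[0] + " " + result[1] + " " + result[2]); // prints 3 3 2
    }
}
```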
-
getIntersection
private int getIntersection(IntersectionSimilarity<T>.TinyBag bagA, IntersectionSimilarity<T>.TinyBag bagB)
Computes the intersection between two bags. This is the sum of the minimum count of each element that is within both sets.
- Parameters:
bagA - the bag A
bagB - the bag B
- Returns:
- The intersection
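The "sum of the minimum count" rule can be sketched with plain maps standing in for the private TinyBag class. Representing a bag as Map<T, Integer> is an assumption made for this example, as is the class name BagIntersectionSketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a Map<T, Integer> stands in for the private TinyBag.
class BagIntersectionSketch {
    static <T> int intersection(Map<T, Integer> bagA, Map<T, Integer> bagB) {
        int total = 0;
        for (Map.Entry<T, Integer> entry : bagA.entrySet()) {
            Integer other = bagB.get(entry.getKey());
            if (other != null) {
                // An element shared by both bags contributes its smaller count.
                total += Math.min(entry.getValue(), other);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();
        a.put("a", 2);
        a.put("b", 1);
        a.put("c", 4);
        Map<String, Integer> b = new HashMap<>();
        b.put("a", 1);
        b.put("c", 5);
        b.put("d", 3);
        // min(2, 1) for "a" plus min(4, 5) for "c"
        System.out.println(intersection(a, b)); // prints 5
    }
}
```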
-
toBag
Converts the collection to a bag. The bag will contain the count of each element in the collection.
- Parameters:
objects - the objects
- Returns:
- The bag
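Counting each element of a collection can be sketched as follows. Again a Map<T, Integer> stands in for the private TinyBag, and the class name ToBagSketch is hypothetical; the library's own bag may be implemented differently.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: builds element counts in a Map instead of a TinyBag.
class ToBagSketch {
    static <T> Map<T, Integer> toBag(Collection<T> objects) {
        Map<T, Integer> bag = new HashMap<>();
        for (T object : objects) {
            // Start at 1 for a new element, otherwise add 1 to the existing count.
            bag.merge(object, 1, Integer::sum);
        }
        return bag;
    }

    public static void main(String[] args) {
        Map<String, Integer> bag = toBag(Arrays.asList("a", "b", "a"));
        System.out.println(bag.get("a")); // prints 2
        System.out.println(bag.get("b")); // prints 1
    }
}
```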
-