![]() |
Leptonica 1.85.0
Image processing and image analysis suite
|
Go to the source code of this file.
Functions | |
| SARRAY * | sarraySort (SARRAY *saout, SARRAY *sain, l_int32 sortorder) |
| SARRAY * | sarraySortByIndex (SARRAY *sain, NUMA *naindex) |
| l_int32 | stringCompareLexical (const char *str1, const char *str2) |
| L_ASET * | l_asetCreateFromSarray (SARRAY *sa) |
| l_ok | sarrayRemoveDupsByAset (SARRAY *sas, SARRAY **psad) |
| l_ok | sarrayUnionByAset (SARRAY *sa1, SARRAY *sa2, SARRAY **psad) |
| l_ok | sarrayIntersectionByAset (SARRAY *sa1, SARRAY *sa2, SARRAY **psad) |
| L_HASHMAP * | l_hmapCreateFromSarray (SARRAY *sa) |
| l_ok | sarrayRemoveDupsByHmap (SARRAY *sas, SARRAY **psad, L_HASHMAP **phmap) |
| l_ok | sarrayUnionByHmap (SARRAY *sa1, SARRAY *sa2, SARRAY **psad) |
| l_ok | sarrayIntersectionByHmap (SARRAY *sa1, SARRAY *sa2, SARRAY **psad) |
| SARRAY * | sarrayGenerateIntegers (l_int32 n) |
| l_ok | sarrayLookupCSKV (SARRAY *sa, const char *keystring, char **pvalstring) |
Sort
SARRAY *sarraySort()
SARRAY *sarraySortByIndex()
l_int32 stringCompareLexical()
Set operations using aset (rbtree)
L_ASET *l_asetCreateFromSarray()
l_int32 sarrayRemoveDupsByAset()
l_int32 sarrayUnionByAset()
l_int32 sarrayIntersectionByAset()
Hashmap operations
L_HASHMAP *l_hmapCreateFromSarray()
l_int32 sarrayRemoveDupsByHmap()
l_int32 sarrayUnionByHmap()
l_int32 sarrayIntersectionByHmap()
Miscellaneous operations
SARRAY *sarrayGenerateIntegers()
l_int32 sarrayLookupCSKV()
We have two implementations of set operations on an array of strings:
(1) Using an underlying tree (rbtree).
This uses a good 64 bit hashing function for the key,
that is not expected to have hash collisions (and we do
not test for them). The tree is built up of the hash
values, and if the hash is found in the tree, it is
assumed that the string has already been found.
(2) Building a hashmap from the keys (hashmap).
This uses a fast 64 bit hashing function for the key, which
is then hashed into a hashtable. Collisions of hashkeys are
very rare, but the hashtable is designed to allow more than one
hashitem in a table entry. The hashitems are put in a list at
each hashtable entry, which is traversed looking for the key.
Definition in file sarray2.c.
| [in] | sa |
Definition at line 224 of file sarray2.c.
References L_NOCOPY.
Referenced by sarrayIntersectionByAset().
| [in] | sa | input sarray |
Definition at line 426 of file sarray2.c.
References L_NOCOPY.
Referenced by sarrayIntersectionByHmap(), and sarrayRemoveDupsByHmap().
| SARRAY * sarrayGenerateIntegers | ( | l_int32 | n | ) |
| [in] | sa1 | |
| [in] | sa2 | |
| [out] | psad | intersection of the two string arrays |
Notes:
(1) Algorithm: put the larger sarray into a set, using the string
hashes as the key values. Then run through the smaller sarray,
building an output sarray and a second set from the strings
in the larger array: if a string is in the first set but
not in the second, add the string to the output sarray and hash
it into the second set. The second set is required to make
sure only one instance of each string is put into the output sarray.
This is O(mlogn), {m,n} = sizes of {smaller,larger} input arrays.
Definition at line 369 of file sarray2.c.
References l_asetCreateFromSarray(), L_COPY, and L_NOCOPY.
| [in] | sa1 | |
| [in] | sa2 | |
| [out] | *psad | intersection of the array values |
Definition at line 540 of file sarray2.c.
References L_COPY, l_hmapCreateFromSarray(), L_NOCOPY, and sarrayRemoveDupsByHmap().
| l_ok sarrayLookupCSKV | ( | SARRAY * | sa, |
| const char * | keystring, | ||
| char ** | pvalstring ) |
| [in] | sa | of strings, each being a comma-separated pair of strings, the first being a key and the second a value |
| [in] | keystring | an input string to match with each key in sa |
| [out] | pvalstring | the returned value string corresponding to the input key string, if found; otherwise NULL |
Notes:
(1) The input sa can have other strings that are not in
comma-separated key-value format. These will be ignored.
(2) This returns a copy of the first value string in sa whose
key string matches the input keystring.
(3) White space is not ignored; all white space before the ','
is used for the keystring in matching. This allows the
key and val strings to have white space (e.g., multiple words).
Definition at line 641 of file sarray2.c.
References L_NOCOPY.
| [in] | sas | |
| [out] | psad | with duplicates removed |
Notes:
(1) This is O(nlogn), considerably slower than
sarrayRemoveDupsByHmap() for large string arrays.
(2) The key for each string is a 64-bit hash.
(3) Build a set, using hashed strings as keys. As the set is
built, first do a find; if not found, add the key to the
set and add the string to the output sarray.
Definition at line 266 of file sarray2.c.
References L_COPY, and L_NOCOPY.
Referenced by sarrayUnionByAset().
| [in] | sas | |
| [out] | psad | hash set of unique values |
| [out] | phmap | [optional] hashmap used for lookup |
Definition at line 457 of file sarray2.c.
References L_Hashmap::hashtab, L_COPY, l_hmapCreateFromSarray(), L_INSERT, L_Hashitem::next, L_Hashmap::tabsize, and L_Hashitem::val.
Referenced by sarrayIntersectionByHmap(), and sarrayUnionByHmap().
| [in] | saout | output sarray; can be NULL or equal to sain |
| [in] | sain | input sarray |
| [in] | sortorder | L_SORT_INCREASING or L_SORT_DECREASING |
Notes:
(1) Set saout = sain for in-place; otherwise, set naout = NULL.
(2) Shell sort, modified from K&R, 2nd edition, p.62.
Slow but simple O(n logn) sort.
Definition at line 98 of file sarray2.c.
References Sarray::array, L_SORT_DECREASING, L_SORT_INCREASING, and stringCompareLexical().
| [in] | sa1 | |
| [in] | sa2 | |
| [out] | psad | union of the two string arrays |
Notes:
(1) Duplicates are removed from the concatenation of the two arrays.
(2) The key for each string is a 64-bit hash.
(2) Algorithm: Concatenate the two sarrays. Then build a set,
using hashed strings as keys. As the set is built, first do
a find; if not found, add the key to the set and add the string
to the output sarray. This is O(nlogn).
Definition at line 320 of file sarray2.c.
References sarrayRemoveDupsByAset().
| [in] | sa1 | |
| [in] | sa2 | |
| [out] | *psad | union of the array values |
Definition at line 506 of file sarray2.c.
References sarrayRemoveDupsByHmap().
| l_int32 stringCompareLexical | ( | const char * | str1, |
| const char * | str2 ) |
| [in] | str1 | |
| [in] | str2 |
Notes:
(1) If the lexical values are identical, return a 0, to
indicate that no swapping is required to sort the strings.
Definition at line 184 of file sarray2.c.
Referenced by sarraySort().