|
"Collation", or "collating sequence", is the set of rules used to order string values. |
|
While it is obvious that letters sort alphabetically (a to z) and digits by increasing value (0 to 9), it is less obvious how to handle:
|
Upper case vs Lower case
Is 'a' smaller than, larger than, or equal to 'A'?
|
|
Digits vs alphabetic characters
Is a '0' smaller or larger than an 'a' or 'A'?
|
|
Accented characters
Is 'é', 'è' or 'ê' smaller than, larger than, or equal to 'e'?
And how are the different accents on the same base character ordered?
This can even be language dependent: in Spanish, for example, 'ñ' is
considered a separate letter that follows 'n', whereas in French 'é'
and 'e' are treated as the same base letter.
|
|
Other symbols
How do you treat all the other symbols like - , ; . / + etc.?
|
|
|
The questions above only concern character-by-character comparison, but collation can
also involve interpreting groups of characters in a string.
In a collating sequence for names, it could be useful or necessary to treat "MacAdam"
and "McAdam" as equal. Combining digits into a numeric value could also be useful, so that "100"
sorts after "82".
Many other situations call for their own specific type of collation.
|
|
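To make the two context-dependent examples above concrete, here is a minimal Python sketch. It is only an illustration of the general idea and not something Venice does; the helper names `natural_key` and `name_key` are hypothetical, and the "Mc" rule is deliberately crude.

```python
import re

def natural_key(s: str) -> list:
    """Sort key that compares runs of ASCII digits by numeric value."""
    key = []
    # re.split with a capturing group puts the digit runs at odd indices.
    for i, part in enumerate(re.split(r"([0-9]+)", s)):
        if part == "":
            continue
        if i % 2 == 1:
            key.append((0, int(part), ""))   # numeric run
        else:
            key.append((1, 0, part))         # text run
    return key

def name_key(name: str) -> str:
    """Crude, hypothetical rule: treat a leading "Mc" as "Mac"."""
    return "Mac" + name[2:] if name.startswith("Mc") else name

print(sorted(["100", "82", "9"]))                    # ['100', '82', '9']
print(sorted(["100", "82", "9"], key=natural_key))   # ['9', '82', '100']
print(name_key("McAdam") == name_key("MacAdam"))     # True
```

A plain character-by-character comparison puts "100" before "82" because '1' is smaller than '8'; the key above compares the digit runs as integers instead.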
Venice uses only 3 types of collation, all of which are context-free, character-by-character
comparisons:
Case sensitive |
Each character is sorted by its binary value according to the current Windows
character set. In this collating sequence, 'A' and 'a' are treated as
distinct characters.
|
Case insensitive |
Similar to "Case sensitive", but the upper case characters 'A' through 'Z' are considered
equal to their lower case versions. Note that this causes the upper case characters
to sort AFTER the 6 symbols located between 'Z' and 'a' in the character set:
'[', '\', ']', '^', '_' and '`'.
|
Hierarchy |
Similar to "Case insensitive", but with the additional rule that the dot ('.') is smaller
than all other characters.
This is used for fields that have an internal hierarchical ordering
(Analytical Account, Article number, Article Group, ...) and ensures that "A.A"
immediately follows its parent "A", without unwanted values like "A-A" in between the
two.
|
|
|
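The three collating sequences described above can be approximated with Python sort keys. This is a sketch based only on the descriptions on this page, not the actual Venice implementation. The function names are hypothetical, Windows-1252 (`cp1252`) is assumed as the character set, and it is assumed that only 'A' through 'Z' are folded to lower case.

```python
import string

# Assumption: only 'A'-'Z' fold to 'a'-'z'; accented letters are left untouched.
_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def case_sensitive_key(s: str) -> bytes:
    """Binary value of each character in Windows-1252."""
    return s.encode("cp1252")  # raises UnicodeEncodeError outside cp1252

def case_insensitive_key(s: str) -> bytes:
    """Same as case sensitive, after mapping 'A'-'Z' to 'a'-'z'."""
    return s.translate(_FOLD).encode("cp1252")

def hierarchy_key(s: str) -> tuple:
    """Case insensitive, with '.' forced below every other character."""
    # The dot becomes 0; every other byte is shifted up by one to make room.
    return tuple(0 if b == ord(".") else b + 1 for b in case_insensitive_key(s))

symbols = ["a", "[", "B"]
print(sorted(symbols, key=case_sensitive_key))    # ['B', '[', 'a']
print(sorted(symbols, key=case_insensitive_key))  # ['[', 'a', 'B']

codes = ["A.A", "A-A", "A"]
print(sorted(codes, key=case_insensitive_key))    # ['A', 'A-A', 'A.A']
print(sorted(codes, key=hierarchy_key))           # ['A', 'A.A', 'A-A']
```

The last two lines show the difference the hierarchy rule makes: '-' has a lower binary value than '.', so without the rule "A-A" ends up between "A" and its child "A.A".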
Remark |
|
Venice assumes the "Windows-1252" code page is active. This code page is often loosely called
"ISO 8859-1", "ISO/IEC 8859-1", "Western Latin-1" or "Latin-1", although strictly speaking it differs
from ISO 8859-1 in the range 0x80 to 0x9F. While it is possible to run Venice
with a different code page (or even mixed code pages), this is not a supported use case.
|
|
Depending on what programming language you use, and how you use the SDK, you
may need to provide custom string comparison functions to correctly compare a
key segment value.
|
|
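As an illustration of why a custom comparison may be needed, the Python sketch below compares strings by their Windows-1252 binary values, in the style of the "Case sensitive" collating sequence. The function name `compare_binary_cp1252` is hypothetical and not part of the SDK. Python's built-in string comparison works on Unicode code points, which gives a different order for characters such as '€'.

```python
from functools import cmp_to_key

def compare_binary_cp1252(a: str, b: str) -> int:
    """Return -1, 0 or 1, comparing by Windows-1252 byte values (hypothetical helper)."""
    ka = a.encode("cp1252")
    kb = b.encode("cp1252")
    return (ka > kb) - (ka < kb)

values = ["é", "€", "z"]

# Built-in comparison uses Unicode code points: '€' is U+20AC, 'é' is U+00E9.
print(sorted(values))                                         # ['z', 'é', '€']

# In Windows-1252, '€' is byte 0x80 and 'é' is byte 0xE9.
print(sorted(values, key=cmp_to_key(compare_binary_cp1252)))  # ['z', '€', 'é']
```

The same approach extends to the other two collating sequences by transforming both strings before the byte comparison.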