Lexical vs. numerical sorting

Tinderbox containers can sort their contents, as can agents. In addition, action code offers methods for sorting lists. Sorting generally takes one of two forms: lexical or numerical.

Sorting can be set via action code or more normally via the Sort tab of the Action Inspector.

Lexical sort. A lexical sort orders characters in broadly alphanumeric order, at least for unaccented Roman-alphabet languages like English. In fact, such sorts look at the underlying ASCII/Unicode character number and sort from lowest to highest, comparing each character of a word or string in turn. Thus numbers always sort before uppercase letters, and uppercase letters before lowercase letters. Accented characters come after that. This unusual order reflects the numerical sequence of the codes used to indicate different letters, symbols and numbers. This order has several odd effects.
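As a rough illustration of the ordering just described, the following sketch uses Python rather than Tinderbox action code. Python's default string sort also compares code points character by character, so it is assumed here to be a reasonable stand-in for a plain lexical sort; the word list is invented for the example.

```python
# Illustration only: Python's default string sort compares Unicode code
# points, which is assumed to mirror a plain lexical sort.
words = ["apple", "Zebra", "42", "\u00e9lan", "Apple", "zebra"]  # "\u00e9" is e-acute

lexical = sorted(words)

# Show each word with the code point of its first character.
for word in lexical:
    print(f"{ord(word[0]):>4}  {word}")

# Digits (52) come first, then uppercase (65, 90), then lowercase (97, 122),
# and the accented word (233) comes last.
first_codes = [ord(word[0]) for word in lexical]
```

Note that "Zebra" lands before "apple": every uppercase letter has a lower code than any lowercase letter, which is one of the odd effects a human reader may not expect.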

Numerical sort. Only used for number sequences. Here the numerical value of the whole number string is computed and these values are sorted in ascending numerical order. Thus the order is 1, 2, 10, 11, 13, 120, not the lexical order of 1, 10, 11, 120, 13, 2.
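The difference between the two orders can be reproduced with a small sketch. This is Python, not Tinderbox action code, and it only assumes that a lexical sort compares strings character by character while a numerical sort compares the parsed values.

```python
# The same six values, held as strings.
values = ["1", "2", "10", "11", "13", "120"]

# Lexical: compared character by character, so "120" sorts before "13".
lexical = sorted(values)

# Numerical: each string is converted to its integer value before comparing.
numerical = sorted(values, key=int)

print("lexical:  ", ", ".join(lexical))
print("numerical:", ", ".join(numerical))
```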

Dates. Dates sort in date order, naturally. The exact form is neither strictly lexical nor numerical, but Tinderbox takes care of sorting dates correctly.

Transforms

To work around some of the limitations of basic lexical sorts, as seen from a human perspective, Tinderbox also offers some 'transforms' which tweak the way sorting occurs:

Sorting in accented/non-roman text languages

Sorting here may not be as expected, due to the limitations of lexical sorts which are not, without further manipulation ('collation'), aware of per-language sorting nuances. This area of the application is noted as having scope for improvement, and more locale-specific collation will likely become available in due course.

So for other characters, accents, etc., sorts may not meet linguistic expectations, as the order will be based on Unicode sort order. Thus:

"dog" > "cat"

"dog" > "Dog"

"dogs" > "dog"

"dogs" > "dogma"

"dogs" < "døg" <-- NOTE!
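The comparisons above can be checked with a short sketch. It is written in Python, whose string comparison is by Unicode code point, and is offered only as an assumed stand-in for an uncollated lexical comparison, not as a description of how Tinderbox is implemented.

```python
import operator

# Each tuple is (left, relation, right), copied from the list above.
# "\u00f8" is the precomposed character o-with-stroke.
checks = [
    ("dog", ">", "cat"),
    ("dog", ">", "Dog"),
    ("dogs", ">", "dog"),
    ("dogs", ">", "dogma"),
    ("dogs", "<", "d\u00f8g"),
]

relations = {">": operator.gt, "<": operator.lt}

# Evaluate every comparison using code point order.
results = [relations[rel](left, right) for left, rel, right in checks]

for (left, rel, right), holds in zip(checks, results):
    print(f'"{left}" {rel} "{right}": {holds}')
```

The last case turns on the second character alone: plain "o" has code 111 while "ø" has code 248, so the word containing "ø" sorts after "dogs" even though it is shorter.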

The prevailing locale's sorting rules for handling diacritics and accents.

Tinderbox will use the OS's localisation settings to determine what rules apply for the sorting of accented and other characters such as ß. If it is desirable to sort using a different localisation, consider using locale() to alter the local Tinderbox environment.

Sorting and Lists/Sets

The discrete values are sorted such that they are listed in lexical sort order.