This version is out of date, covering development from v5.0.0 to v5.12.2. It is maintained here only for inbound reference links from elsewhere.



Sorting - lexical and numeric

Tinderbox containers can sort their contents, as can agents. In addition, action code offers methods for sorting lists. Sorting generally takes one of two forms: lexical or numerical.

Lexical sort. A lexical sort orders characters in broadly alphanumeric order for unaccented Roman-alphabet languages like English. In fact, such a sort compares the underlying ASCII/Unicode character number of each character in turn along a word or string, and sorts from lowest to highest. Thus digits always sort before uppercase letters, and uppercase letters before lowercase letters. Accented characters come after that. This unusual order reflects the numerical sequence of the codes used to indicate different letters, symbols and numbers. This order has several odd effects:
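The code-point ordering can be seen in a minimal sketch. This is Python rather than Tinderbox action code, chosen only because Python's default string comparison also works code point by code point; the sample words and the name `lexical_order` are illustrative assumptions, not taken from Tinderbox.

```python
# Python compares strings one code point at a time, which mirrors the
# lexical ordering described above. The sample words are illustrative only.
words = ["apple", "Zebra", "10", "\u00e9clair", "zebra"]  # \u00e9 is 'é'

lexical_order = sorted(words)

# Print each word with the code point of its first character, showing
# digits < uppercase < lowercase < accented letters.
for word in lexical_order:
    print(word, ord(word[0]))
```

The digit-initial string comes first, then the capitalised word, then the lowercase words, with the accented word last.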

Numerical sort. This is only used for number sequences. Here the numerical value of each whole number string is computed and these values are sorted in ascending numerical order. Thus the order is 1, 2, 10, 11, 13, 120 and not the lexical order of 1, 10, 11, 120, 13, 2.
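The difference between the two orders can be reproduced with a short Python sketch (again an illustration, not Tinderbox action code). It sorts the same six number strings twice: once as text, and once by their computed integer value. The variable names are assumptions made for the example.

```python
# The six values from the text, held as strings.
values = ["120", "13", "2", "11", "1", "10"]

# Lexical: compared character by character, so "10" sorts before "2".
as_text = sorted(values)

# Numerical: each string is converted to its integer value before comparing.
as_numbers = sorted(values, key=int)

print("lexical:  ", ", ".join(as_text))
print("numerical:", ", ".join(as_numbers))
```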

Dates. Dates sort in date order, naturally. The exact form is neither strictly lexical nor numerical, but Tinderbox takes care of sorting dates correctly.

Transforms

To work around some of the limitations of basic lexical sorts, as seen from a human perspective, Tinderbox also offers some 'transforms' which tweak the way sorting occurs:

Sorting in accented/non-Roman text languages

Results may not be as expected, owing to the limitations of lexical sorts, which are not, without further manipulation ('collation'), aware of per-language sorting nuances. This area of the application is noted as having scope for improvement, and more locale-specific collation will likely become available in due course.

So for other characters, accents, etc., sorts may not meet linguistic expectations, as the values are based on Unicode sort order. Thus:

"dog" > "cat"

"dog" > "Dog"

"dogs" > "dog"

"dogs" > "dogma"

"dogs" < "døg" <-- NOTE!
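The comparisons above can be checked with a small Python sketch, since Python also compares strings by Unicode code point. This is an illustration only and says nothing about how Tinderbox implements its sorts; the helper name `holds` is hypothetical.

```python
# Each tuple is (left, operator, right), copied from the comparisons above.
# "\u00f8" is the letter 'ø'.
comparisons = [
    ("dog", ">", "cat"),
    ("dog", ">", "Dog"),
    ("dogs", ">", "dog"),
    ("dogs", ">", "dogma"),
    ("dogs", "<", "d\u00f8g"),
]

def holds(left: str, op: str, right: str) -> bool:
    """Evaluate the comparison using plain code-point ordering."""
    return left > right if op == ">" else left < right

results = [holds(*c) for c in comparisons]

for (left, op, right), ok in zip(comparisons, results):
    print(f'"{left}" {op} "{right}":', ok)

# The last case turns on the second character: 'o' has a lower code
# point than 'ø', so "dogs" sorts first despite being the longer word.
print(ord("o"), ord("\u00f8"))
```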

Sorting and Lists/Sets

The discrete values are sorted so that they are listed in lexical sort order.
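As a rough sketch of what this means for a multi-value attribute, the Python below splits a list string into its discrete values, sorts them lexically, and joins them again. The semicolon delimiter, the sample values and the function name `sort_list_value` are assumptions made for the illustration, not Tinderbox's own code.

```python
def sort_list_value(raw: str) -> str:
    """Return the discrete values of a semicolon-delimited list string,
    re-joined in lexical (code-point) order."""
    # Split on the assumed ';' delimiter and drop surrounding whitespace
    # and any empty entries.
    items = [item.strip() for item in raw.split(";") if item.strip()]
    return ";".join(sorted(items))

sorted_value = sort_list_value("pear;Apple;10;banana")
print(sorted_value)
```

As with any lexical sort, a digit-initial value lands before a capitalised one, which in turn lands before lowercase values.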

