Tinderbox's new string processing operators (as of v9.1.0) are intended to help extract information from structured and semi-structured text. Such text may be hand-typed or copied from sources like email. Often, it is imported from other programs or downloaded from web services into a Tinderbox attribute. The task is then to extract the needed information from this text.
Regular expressions. Be aware that stream processing operators do not use regular expressions (regex). If regex is needed to complete the task, either use the ordinary String processing operators or insert appropriate delimiters into the text before processing.
Broadly speaking, the parsing approach is to begin at the start of the string and proceed, step by step, following a recipe expressed as chained dot-operators. For example, such a 'recipe' might say:
- Read until you find a line that begins with "To:", "From:", or "Subject:"
- If you find a "To:", copy every character following that up to the first space character encountered and save the copy in the current note's $Email.
- If you find a "From:", copy every character following that up to the first space character encountered and save the copy in the current note's $EmailFrom.
- If you find a "Subject:", get the rest of the current line and use that for the $Name of the current note.
- Having found a "Subject:", delete all the headers you have processed and leave the rest of the text.
- If you never find a "Subject:", do not delete anything.
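The recipe above can be modelled outside Tinderbox to make its control flow explicit. The following Python sketch is an illustration only, not Tinderbox action code: the function names parse_headers and first_token are hypothetical, the dictionary keys stand in for the attributes $Email, $EmailFrom and $Name, and it assumes that whitespace directly after a header label is ignored before copying up to the first space.

```python
def first_token(value):
    # Assumption: ignore whitespace right after the label,
    # then copy up to (not including) the first space character.
    return value.lstrip().split(" ", 1)[0]


def parse_headers(text):
    """Model of the recipe: return (captured fields, remaining text)."""
    fields = {}
    lines = text.split("\n")
    for index, line in enumerate(lines):
        if line.startswith("To:"):
            fields["Email"] = first_token(line[len("To:"):])
        elif line.startswith("From:"):
            fields["EmailFrom"] = first_token(line[len("From:"):])
        elif line.startswith("Subject:"):
            # The rest of the Subject line becomes the note's name.
            fields["Name"] = line[len("Subject:"):].strip()
            # Subject found: drop everything processed so far, keep the rest.
            return fields, "\n".join(lines[index + 1:])
    # No Subject line was found: leave the text untouched.
    return fields, text
```

Calling parse_headers on a text whose first three lines are To:, From: and Subject: headers returns the three captured values plus the body that follows the Subject line. If there is no Subject line, the original text comes back unchanged, matching the last step of the recipe.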
All functional string processing operators accept a string of text being processed, called the stream. They act in some way on the stream, possibly saving some data into an attribute or simply moving further forward (left to right), and return the unprocessed remainder (the right-most portion of the stream), which may be passed to other operators such as further chained dot-operators. For example:
$MyString.skip(22).captureNumber("MyNumber");
takes the value of $MyString, skips exactly 22 characters, and extracts a number to be stored in $MyNumber. For instance, when $MyString holds the string "We think there may be 1234 items":
$MyString.skip(22).captureNumber("MyNumber");
$MyNumber is 1234. But if $MyString holds the string "We think there may be 1,234 items", then $MyNumber is 1, because a comma follows the first digit once the skip operator has consumed the first 22 characters.
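The difference between the two results can be traced with a small model of the stream. This Python sketch is not Tinderbox action code and makes no claim about how Tinderbox implements these operators: skip and capture_number are hypothetical stand-ins for .skip() and .captureNumber(), and the model assumes a number is simply an unbroken run of the digits 0-9.

```python
DIGITS = "0123456789"


def skip(stream, count):
    """Consume the first `count` characters and return the remainder."""
    return stream[count:]


def capture_number(stream):
    """Read leading digits; return (number or None, unprocessed remainder)."""
    end = 0
    while end < len(stream) and stream[end] in DIGITS:
        end += 1
    if end == 0:
        # Assumption: with no leading digit, nothing is consumed.
        return None, stream
    return int(stream[:end]), stream[end:]


plain, plain_rest = capture_number(skip("We think there may be 1234 items", 22))
comma, comma_rest = capture_number(skip("We think there may be 1,234 items", 22))
```

In this model, plain is 1234 with " items" left in the stream, while comma is 1 with ",234 items" left over, because reading stops at the first character that is not a digit.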
The parsing operators can best be understood as a series of discrete roles: