

Operator Type: Function  [other Function type actions]

Operator Scope of Action: Item  [operators of similar scope]

Operator Purpose: Data manipulation  [other Data manipulation operators]

Operator First Added: 

Operator Last Altered: As at baseline

Operator Uses Regular Expressions: [More on regular expressions in Tinderbox]


This operator splits a string into a List, as divided by instances of regular expression pattern regexStr in the original string. Source characters matched by regexStr are not passed to the list. The source string itself is not affected.

regexStr is one of:

Useful regex values are:

The result of the operator is a List-type attribute value, i.e. the data should be passed to a list. Passing the output to a Set-type attribute will de-dupe any list values in the output with the first instance of any duplicates forming its set entry.

For example:

$MyList = [ant bee ant cow].split(" "); gives [ant;bee;ant;cow]

$MySet = [ant bee ant cow].split(" "); gives [ant;bee;cow]

$MyList = [ant, bee, cow].split("\W+ "); gives [ant;bee;cow]

$MyList = [ant, bee, cow].split(" "); gives [ant,;bee,;cow]

$MyList = $MyString.split($MyString(agent)); 

$MyList = $MyString(parent).split("and"); 

If the string, stored in $MyString, is multi-line:

    ant
    bee
    cow

$MyList = $MyString.split("\n"); 

gives [ant;bee;cow].
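
The same split can also be used simply to count paragraphs. A minimal sketch, assuming a Number-type user attribute $MyNumber exists (a hypothetical name used for illustration, not a built-in attribute) and relying on the List .count property:

    $MyNumber = $MyString.split("\n").count;

For a string that splits to [ant;bee;cow], this should give 3.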

This approach can be useful if trying to retrieve a specific paragraph of $Text, perhaps from notes exploded from a larger consistently formatted text source. To get a string holding just paragraph #3 of the source $Text (or other multi-line string data):

$MyString = $Text.split("\n").at(2); 

Do not overlook the fact that List.at() is zero-based. That means the first list item is .at(0), so the third list item is '2' and not '3' as might otherwise be assumed. The last item is '-1':

$MyString = $Text.split("\n").at(-1); 
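
Where several paragraphs are needed from the same source, it may be tidier to split once into a variable and then index that list. This is a sketch only, assuming String-type user attributes $MyFirst, $MyThird and $MyLast exist (hypothetical names used for illustration):

    var:list vParas = $Text.split("\n");
    $MyFirst = vParas.at(0);
    $MyThird = vParas.at(2);
    $MyLast = vParas.at(-1);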

There is one limitation of this approach to working with $Text or multi-line strings. Blank lines, or lines containing only spaces, are ignored because lists do not hold 'empty' items. So if the string $MyString is multi-line and contains blank lines, like so:

    ant

    bee
    cow

$MyList = $MyString.split("\n"); 

still gives [ant;bee;cow].

It does not matter whether the blank line is just two successive line returns or actually contains some white space: no list item is created for it.

Luckily there is a simple workaround: seed empty lines with a single hyphen (or whatever placeholder you prefer, e.g. "N/A"). Thus:

$MyList = $Text.replace("\n\n","\n-\n").split("\n"); 

…now gives $MyList as [ant;-;bee;cow], so that "bee" is still paragraph #3 of the new list, as in the original text. If you wanted to make a deliberate review of such data you might use a more distinctive marker string:

$MyList = $Text.replace("\n\n","\n#####\n").split("\n"); 

You could then query for $MyList.contains("#####").
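
One caveat, offered as an assumption about how .replace() scans rather than as documented behaviour: regular expression matches are normally found left to right without overlapping, so in a run of several consecutive blank lines a single pass may seed only every other one. If that turns out to be the case, applying the same replacement twice should seed the remainder:

    $MyList = $Text.replace("\n\n","\n#####\n").replace("\n\n","\n#####\n").split("\n");

Here a source of "ant", two blank lines, then "bee" would be expected to give [ant;#####;#####;bee].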

Dealing with inline quote characters

Because regexStr is parsed as a regular expression, it may be possible to use the '\dnn' form described here to work around the lack of escaping for single or double quotes within strings.

Dealing with inline semi-colons

As this function outputs a list, in which values are semicolon-delimited, any semicolons in the source string (such as $Text) act as extra, unexpected splits when viewing the outcome. To get around this, escape the semicolons on the fly:

$MyList = $Text.replace(";","\\;").split("\n"); 

However, the surviving inline semicolons in the resulting List items will be misread as item delimiters when interrogating the List. In such cases, first replace inline semicolons with another character or characters before using the split-generated list, and then reverse the replacement before actual use of the text. For example:

    var:string vSource = $Text.replace(";","@@@");
    $MyList = vSource.paragraphList;
    $MyString = $MyList.at(4).replace("@@@",";");