
Query back-references

By using parentheses within a query's pattern, it is possible to set up to 9 back-references within the overall pattern. These sub-matches can then be used in the action connected with the query. The most common way to set a query pattern with back-references is via the operators String.contains(pattern) or String.icontains(pattern).

Back-references can be used in actions in several contexts, each described below: in an agent's action, within an if() statement, and within String.replace().

NOTE: Although Macros use back-reference style notation for inserting content, in that case the values are drawn from the macro's input parameters rather than from a query. In addition, whilst find() takes a query, it uses the query only to return the paths of matching notes and does not support back-references.

Referring to a back-reference

A back-reference is referred to via a $-prefixed number, $0 through $9. The back-reference $0 always refers to the whole matched string (or sub-string) for the stated query pattern, i.e. it may match all or part of the target string. $1 to $9 refer to any defined back-references within the overall pattern, as discussed in the examples below.

Back-references are returned (i.e. number-referenced) in the order created. The order is usually left to right, in the order in which the parentheses open (note this allows for nesting). To understand that process better, read up on regular expression back-references.

Do back-references need quoting? No. If $MyString is "This or that", all of the following result in the value "This and that":

$MyString=$MyString.replace("(^.+)or(.+$)","$1and$2"); 

$MyString=$MyString.replace("(^.+)or(.+$)","$1"+"and"+"$2"); 

$MyString=$MyString.replace("(^.+)or(.+$)", $1+"and"+$2); 

Back-references 1: in an agent context

This is an example of an agent query designed to create back-references that can then be used in the agent's action:

query: $Text.contains("email: (\w+([,| |-]*\w*)*)\<([^>]+)\>")

action: $FullName=$1; $Email=$3

The action will set the value of attributes $FullName and $Email using the back-references to the pattern matched in the currently processed note's $Text (strictly, that of the note's alias, as this is an agent). So, for a worked example, if the $Text was:

Project X
Brief discussion to finalise resources allocation
Source email: John Doe, on 24/03/2010
Follow up actions: Bob, Mary.

 

…then the above query would give the following back-references:

Back-references 2: using if(){}

Using the same example as above, an if() usage might look like this (the line breaks are not significant and are used here only for readability):

	if($Text.contains("Emailed by: (\w+([,| |-]*\w*)*)<([^>]+)>")){
		$MyString=$0;
		$FullName=$1;
		$Email=$3;
	};

In this method the if() operator holds the query and generates the back-references. These can be used anywhere within, but only within, the operator's { } curly braces enclosing the action code. The back-references could be used in the else { } branch, but the nature of the overall usage (i.e. for back-reference generation) means this is unlikely.
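A further minimal sketch of the same technique, this time with an else branch that simply clears the target attribute when the query fails. The note name format (e.g. a note called Order 1234 shipped) is hypothetical, and \d is assumed to be accepted as a digit class in the same way \w is in the patterns above. The // comment line assumes a Tinderbox release that supports action code comments; it can otherwise be omitted.

```
// Hypothetical name format such as Order 1234 shipped
if($Name.contains("Order (\d+)")){
	$MyString = $1;
}else{
	$MyString = "";
};
```

For a note named Order 1234 shipped, $0 should be "Order 1234" and $MyString should be set to "1234". For any note whose $Name does not match, no back-references exist and $MyString is emptied.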

See if() for further back-reference usage examples.

Back-references 3: using string.replace()

String.replace() is used to replace part of an existing string attribute value. The operator can be thought of in terms of $SourceDataString.replace("query","return string"), where the "return string" might be one or more back-references from the query and may include string literals.

For example, assume $MyString has the value "AABBCC", from which it is desired to make a value of "BB". Essentially this means deleting all the non-'B' characters. This can be done by capturing the 'B's in a back-reference and using that to replace the original value. Thus to replace the original $MyString value:

$MyString = $MyString.replace(".*(BB).*","$1"); 

Here the $1 back-reference is passed inside quotes as the second parameter; the examples under 'Referring to a back-reference' indicate that an unquoted form also works. Alternatively, the altered string can be saved to a different attribute, leaving $MyString unchanged:

$AnotherString = $MyString.replace(".*(BB).*","$1"); 

The back-references created here cannot be used except in the second input ('replacement') parameter. Clearly, the applications for using string.replace() are far more limited than when using an if() statement.
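As a hedged sketch of using more than one back-reference in the replacement parameter, the following swaps two captured words. The starting value is hypothetical test data. The // comment line assumes a Tinderbox release that supports action code comments; it can otherwise be omitted.

```
// Hypothetical test value in the form Surname, Forename
$MyString = "Doe, John";
$MyString = $MyString.replace("^(\w+), (\w+)$","$2 $1");
```

Here $1 captures "Doe" and $2 captures "John", so $MyString should end up as "John Doe". The comma and space of the source are matched but not captured, so they are dropped from the result.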

See String.replace() for further examples of use of back-references within an action context.

Nesting back-references

Back-references may be nested inside one another (as seen in the opening example above):

Query: $Name.contains("(a(ard))v(ark)")

Action: $MyString =$1; $MyStringA = $2; $MyStringB = $3;

For the matched note the 3 attributes set by the action will hold, in order, "aard", "ard" and "ark". This shows back-references are numbered in the order encountered running left to right and not by some other system such as the level of nesting.

Literal parentheses

Literal parentheses in patterns must be escaped by a backslash. To match "this (that) other", use:

$Text.contains("this \(that\) other") 

To capture "(that)" as back-reference $1:

$Text.contains("this (\(that\)) other") 
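The two techniques can be combined. In this sketch, which assumes the same sample text, the literal parentheses are escaped and the capturing parentheses sit inside them, so $1 should hold "that" without its surrounding parentheses. The use of if() and $MyString follows the earlier examples; the // comment line assumes a Tinderbox release that supports action code comments.

```
// Captures only the text between the literal parentheses
if($Text.contains("this \((that)\) other")){
	$MyString = $1;
};
```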

Sometimes parentheses are needed in order to achieve the right match, as in the agent example shown earlier, but capture nothing meaningful for back-reference use. That is not a problem: there is no need to use every back-reference created.

What is the role of $0?

$0 is always the whole matched (sub-)string for the stated attribute value. If the regex pattern creates additional back-references within the query, then $1 through $9 may be used to access those additional matched sub-strings.

In the agent example above, $0 is not all of the current note's $Text, the overall source for the query. Rather, it is all the text matched within $Text by the regex code in the '.contains("pattern")' operator's pattern.

Often, the pattern matches the entire source, so $0 returns the whole source text. The structure of the example above is deliberate, so as to show that $0 attaches to the pattern's match rather than simply being the entire text passed to the regular expression.
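The distinction can be seen in a small test, sketched here with hypothetical values. The pattern matches only part of the source string, so $0 should return that part and not the whole value. The // comment line assumes a Tinderbox release that supports action code comments.

```
// Hypothetical test value, the pattern matches only the run of B characters
$MyString = "AABBCC";
if($MyString.contains("B+")){
	$MyStringA = $0;
};
```

After this runs, $MyStringA should be "BB" while $MyString is still "AABBCC".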

Don't worry too much about getting the right number. If new to this sort of work and using a pattern with several back-references, you are strongly advised to try it in a small test file first. This makes it easier to check that the pattern matches as intended and that each back-reference number returns the expected sub-string.

Returning the match offset position (dot operators only)

If the regular expression pattern used with the contains() family of dot-operators (e.g. String.contains()) is found, the function returns the match's offset+1, where offset is the distance from the start of the string to the start of the matched pattern. Formerly, .contains() returned true if the pattern was found. The '+1' modifier ensures that a match at offset zero returns a number greater than zero; a returned 0 would coerce to false. Since offset+1 always coerces to true, no changes are required in existing documents, but the function also gives usable offset information, albeit requiring adjustment (subtracting 1) for use with zero-based indices such as List.at() or String.substr().
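A minimal sketch of using the returned value, based only on the behaviour described above. $MyNumber is a hypothetical Number-type user attribute and the test value is illustrative. The // comment lines assume a Tinderbox release that supports action code comments.

```
// Hypothetical test value, BB starts at zero-based offset 2
$MyString = "AABBCC";
$MyNumber = $MyString.contains("BB");
// Subtract 1 to get a zero-based offset for use with substr
$MyNumber = $MyNumber - 1;
$MyStringA = $MyString.substr($MyNumber);
```

If the behaviour is as described, .contains() returns 3 (offset 2, plus 1), $MyNumber ends as 2, and $MyStringA receives the text from the start of the match onwards, i.e. "BBCC".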


