
Message: Tinderbox was unable to parse this file

Very occasionally, on opening a file you may get a dialog with a message like:

Tinderbox was unable to parse this file. It may be damaged, or you may need a newer version of Tinderbox. The XML parser said : not-well formed (invalid token) (line 1232).

Whilst the exact line number stated at the end will vary, Tinderbox is telling you that the TBX file's data includes something it cannot understand at the stated line in the source XML of the document. When Tinderbox opens a file, it has to read in all the XML data stored in the TBX document file. If this data has been corrupted in some way such that it is no longer well-formed XML, Tinderbox cannot read past that point and gives this error message.
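For those comfortable with a little scripting, the reported location can be cross-checked outside Tinderbox. The following is a minimal sketch, assuming Python 3 and its bundled Expat parser; this is not necessarily the parser Tinderbox itself uses, so the line number and wording may differ slightly from the dialog. The function name find_xml_error and the file name in the usage comment are hypothetical. Always run it against a copy of the TBX file.

```python
import sys
import xml.parsers.expat


def find_xml_error(data):
    """Return (line, column, message) for the first XML error in data,
    or None if the data parses as well-formed XML."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk of input
    except xml.parsers.expat.ExpatError as err:
        # lineno is 1-based; offset is the 0-based column within that line
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_xml_error.py copy-of-document.tbx
    with open(sys.argv[1], "rb") as handle:
        problem = find_xml_error(handle.read())
    if problem is None:
        print("No XML errors found")
    else:
        print("line %d, column %d: %s" % problem)
```

The parser stops at the first error, so a file with several damaged spots needs to be re-checked after each repair.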

How might such a thing occur? The error is unusual but causes include copying/pasting text from web pages that mis-declare their encoding, such as with web pages where quote marks show as question marks (e.g. character) or less common accented characters as pairs of random characters. It is not always possible for the Tinderbox to detect that the data it is passed is not what it declares itself to be. This can cause the data to be stored inappropriately in Tinderbox, although the effect does not tend to surface until the document is next opened—which for some users can be hours or days after the triggering event. More technically, data is saved in a form that's not intelligible to the XML parser used to read the data when opening a TBX file. Similarly, AutoFetch of data from badly-encoded pages/feeds/sources can ingest data that has the same effect as above.

The solution is to send the file (or better, a zip of it) to Tinderbox support (info@eastgate.com) and they should be able to fix it and return the document. At times the fix may result in the loss of all or part of $Text (or of the affected attribute(s)) of the affected note(s), but generally just the offending characters can be excised.

If you are doing some task where this happens more than once (perhaps you have very 'dirty' source material) then ensure you have reviewed your settings for back-ups and autosave. You can always change back to your defaults once the problem is resolved.

For the more technically minded…

If you are confident using a text editor and looking at XML source code, you can have a go at fixing this yourself. If you do not have a text (code) editor (do not use TextEdit), a good free option is BBEdit. Now:

For TextWrangler and BBEdit users, a slightly more hands-off approach is offered via those apps' "zap gremlins" option. It should prune non-XML-safe characters from the file, though you will not know exactly what was removed. You will just have to look at the resulting TBX file and guess.
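If knowing what was removed matters, a small script can do a similar pruning and report each deletion. What follows is a rough sketch, assuming Python 3 and a UTF-8 encoded TBX file; it is not how BBEdit's command works internally. It removes only characters that the XML 1.0 specification does not permit and will not repair other kinds of damage, such as broken tags. The function name strip_invalid_xml_chars and the file names are hypothetical.

```python
import re
import sys

# Anything outside the character ranges that XML 1.0 permits
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Return (cleaned_text, removed), where removed lists a
    (line_number, code_point) pair for every character deleted."""
    removed = []
    for match in _INVALID_XML_CHARS.finditer(text):
        # Count the newlines before the match to get a 1-based line number
        line = text.count("\n", 0, match.start()) + 1
        removed.append((line, "U+%04X" % ord(match.group())))
    return _INVALID_XML_CHARS.sub("", text), removed


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python strip_xml.py copy-of-document.tbx cleaned.tbx
    with open(sys.argv[1], "rb") as source:
        # Undecodable bytes become U+FFFD, which is legal in XML
        text = source.read().decode("utf-8", errors="replace")
    cleaned, removed = strip_invalid_xml_chars(text)
    with open(sys.argv[2], "w", encoding="utf-8", newline="") as target:
        target.write(cleaned)
    for line, code_point in removed:
        print("line %d: removed %s" % (line, code_point))
```

The reported line numbers refer to the original file, so they can be compared with the line number given in the Tinderbox error dialog. Writing to a second file leaves the original untouched in case the result is worse than the starting point.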