
Setting up for Tab-Delimited Import

Originally, data table import was only for tab-delimited data. CSV is now supported as well and the same rules (below) apply to how the data is handled. CSV formatting is only assumed if the file has a '.csv' extension. For '.txt' or '.tsv' files, tab-delimited format is assumed.
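The extension rule above can be sketched as a small lookup, which may be handy for checking a batch of files before import. This is only an illustration of the rule as described, not Tinderbox's own code: the function name `assumed_format` is hypothetical, extensions are compared exactly as written, and any extension not mentioned above returns `None` because the text does not say how it is treated.

```python
from pathlib import Path

# Extension -> format assumed on import, as described in the text above.
ASSUMED_FORMAT = {".csv": "csv", ".txt": "tab", ".tsv": "tab"}


def assumed_format(filename):
    """Return 'csv' or 'tab' for a file name, or None if the extension is not covered above."""
    return ASSUMED_FORMAT.get(Path(filename).suffix)


for name in ("people.csv", "people.tsv", "people.txt"):
    print(name, "->", assumed_format(name))
```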

For simple data, the auto-generation of attributes from column heads and the data types assumed are generally correct, but for more complex or nuanced tasks there are some points to bear in mind.

Tinderbox assumes the data will contain a first row of column headers which it can use to map data to existing attributes or generate new ones.

The first column always maps to $Name unless a column header value of 'Name' is found.

Headers are assumed, so if none are supplied, Tinderbox will treat the first data row as the header row, which may result in some strange attribute titles!
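As a rough illustration of the $Name rule above, the following sketch picks the column that would supply $Name from a header row. The function name `name_column` is hypothetical, and the case-sensitive comparison against 'Name' is an assumption drawn from the case-sensitive matching this page describes for headers.

```python
def name_column(headers):
    """Return the index of the column expected to map to $Name.

    Uses the first column unless some header is exactly 'Name'.
    """
    return headers.index("Name") if "Name" in headers else 0


print(name_column(["ID", "Name", "City"]))   # a 'Name' header exists, so its column is used
print(name_column(["Title", "City"]))        # no 'Name' header, so the first column is used
```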

Mapping source column headers to existing Tinderbox attributes

Mapping is automatic, based on the Header (first) row of the data. Tinderbox will match a column header, case-sensitively, to any existing system or user attribute. Special cases:

Thus header 'my var' will create a new (String-type) '$my_var' following the naming rules below.

Auto-generated (user) attribute names

If a column header (other than in column #1) does not match an existing system or user attribute name, then a new user attribute will be created. The new attribute will use the exact (case-sensitive) source value insofar as its characters are legal in an attribute name. 'Illegal' characters for attributes are substituted with an underscore per such character. Thus, source heading 'my / stuff' creates $my___stuff (3 underscores for space+slash+space). Note how the case of legal text is maintained and only each illegal character is substituted by an underscore. Underscores, common in some database tables, are thus maintained during import.
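The substitution rule can be previewed outside Tinderbox with a short sketch. It assumes that only ASCII letters, digits and the underscore count as 'legal'; the text above confirms only that spaces and slashes are replaced and that underscores survive, so treat the exact character set (and the function name `attribute_name`) as an assumption rather than Tinderbox's actual rule.

```python
import re

# Assumption: anything other than ASCII letters, digits and underscore is 'illegal'.
ILLEGAL_CHARACTER = re.compile(r"[^A-Za-z0-9_]")


def attribute_name(header):
    """Replace each illegal character with one underscore, keeping case as-is."""
    return ILLEGAL_CHARACTER.sub("_", header)


for header in ("my / stuff", "my var", "Order_ID"):
    print(repr(header), "->", "$" + attribute_name(header))
```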

Data type coercion on import

These appear to be the 'rules' for coercion, which result in only one of 3 data types being created:

Empty Cells

Empty cells (i.e. content-tab-tab-content) and line breaks in cell content are allowed. If you have import problems due to empty cells, try adding a dummy last column of source data with a value in each row, and then delete that attribute (and its data) once the import is complete. Cells may contain line break characters (e.g. text intended for $Text) if the cell's content is enclosed by (straight) double-quotes.
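The dummy-column workaround above can be applied to tab-delimited text with a few lines of script before import. This is a minimal sketch under stated assumptions: the function name `add_dummy_column`, the header 'Dummy' and the filler value 'x' are all arbitrary choices, and the data is assumed to have one record per line (no line breaks inside cells).

```python
def add_dummy_column(tsv_text, header="Dummy", filler="x"):
    """Append a last column to tab-delimited text so that no row ends in an empty cell.

    The first line is treated as the header row and receives `header`;
    every other line receives `filler`.
    """
    out = []
    for i, line in enumerate(tsv_text.splitlines()):
        out.append(line + "\t" + (header if i == 0 else filler))
    return "\n".join(out) + "\n"


source = "Name\tA\tB\nnote 1\t\t\nnote 2\t1\t\n"
print(add_dummy_column(source))
```

After import, the attribute created from the dummy column (here $Dummy) can be deleted as described above.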

Line Breaks

Tab-delimited. Line breaks are not supported in cell values (even for $Text).

CSV. Line breaks are supported, as long as (at least) that cell's value is enclosed with double-quotes.
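If the CSV is generated by script, a standard CSV writer will usually add the required double-quotes automatically. The sketch below uses Python's built-in `csv` module, which by default quotes any field containing a line break; the helper name `to_csv` and the sample data are illustrative only.

```python
import csv
import io


def to_csv(rows):
    """Serialise rows to CSV text; fields containing line breaks are double-quoted."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = [
    ["Name", "Text"],
    ["note 1", "first line\nsecond line"],
]
print(to_csv(rows))
```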

Quote Characters

Both tab-delimited and CSV ignore single straight quotes and single/double typographic ('curly') quotes. Double-straight quotes:

Importing Lists

As even Tinderbox-formatted lists import by default as strings, to avoid getting the wrong data type, a little extra planning is needed.

Source list data has semi-colon delimited values. Add any needed List or Set type attributes (correctly type-configured) to the TBX document before import. If the data is semi-colon delimited it will be correctly parsed as multiple list items.

Source list data does not have semi-colon delimited values. For this you need a two-stage process: ingest the list as a string, then correct the delimiters, passing the result to a List or Set attribute. First, ensure the header doesn't use the attribute name actually desired for your List or Set attribute. Create the latter attribute(s) with the necessary type(s). Add a stamp with 2 actions (i.e. with a semi-colon between them). For example, assume the source list uses a '##' string to delimit values in the 'somelist' column that you want to end up in $SomeSet. The stamp might look like this:

$SomeSet = $somelist.replace("##",";"); 

Stamp the ingested notes, then check the data. If the list data looks correct, you can delete the source attribute (here $somelist). If using auto-generated key attributes, you may also need to update those to show the correct (list) attributes.
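To check what the stamped values ought to look like, the delimiter swap can be previewed on a sample value outside Tinderbox. This sketch only mirrors the string replacement step in Python (using a literal replace); the function name `to_semicolon_list` and the sample value are hypothetical, and it does not model how Tinderbox then parses the result into a Set.

```python
def to_semicolon_list(value, delimiter="##"):
    """Swap a custom list delimiter for the semi-colon that list attributes expect."""
    return value.replace(delimiter, ";")


sample = "ants##bees##cats"
print(to_semicolon_list(sample))
```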

Forcing a non-default data type mapping

As shown above, lists cannot be detected automatically and '0' / '1' values may be misconstrued as booleans. There are two ways to avoid this:

Import fixes Key Attributes in created notes

The import process sets each new per-row note's $KeyAttributes. If you are going to apply a prototype to these notes you may first wish to reset the new notes' Key Attributes.


