Importing CSV data
The BI.Data.CSV unit contains the TBICSV class, which imports data in CSV (Comma-Separated Values) format from files, text strings or streams.
TBICSV attempts to detect the following details of the CSV content automatically:
- Delimiter
The separator character or text between fields.
The comma (",") is tested by default. Other delimiters include the tab and space characters.
A custom delimiter can also be specified:
var CSV : TBICSV; CSV.Delimiter:= '|';
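As an illustration of how a delimiter can be guessed from the content, the following is a minimal sketch. It is not TeeBI's actual implementation: the GuessDelimiter and CountOutsideQuotes helpers are hypothetical, and the approach (counting each candidate in the first line, outside double quotes) is an assumption chosen for simplicity.

```pascal
program GuessDelimiterDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Counts occurrences of Candidate in Line, skipping characters inside quotes.
function CountOutsideQuotes(const Line: string; Candidate, Quote: Char): Integer;
var
  I: Integer;
  InQuotes: Boolean;
begin
  Result := 0;
  InQuotes := False;
  for I := 1 to Length(Line) do
    if Line[I] = Quote then
      InQuotes := not InQuotes
    else if (not InQuotes) and (Line[I] = Candidate) then
      Inc(Result);
end;

// Hypothetical helper: returns the candidate that occurs most often
// in the first line, falling back to the comma when none is found.
function GuessDelimiter(const FirstLine: string): Char;
const
  Candidates: array[0..2] of Char = (',', #9, ' ');
var
  I, N, Best: Integer;
begin
  Result := ',';
  Best := 0;
  for I := Low(Candidates) to High(Candidates) do
  begin
    N := CountOutsideQuotes(FirstLine, Candidates[I], '"');
    if N > Best then
    begin
      Best := N;
      Result := Candidates[I];
    end;
  end;
end;

begin
  // Prints the character code of the guessed delimiter.
  WriteLn(Ord(GuessDelimiter('id,name,price')));        // 44 (comma)
  WriteLn(Ord(GuessDelimiter('id'#9'name'#9'price')));  // 9 (tab)
  WriteLn(Ord(GuessDelimiter('"Smith, John",42')));     // 44 (comma)
end.
```

A guess based on a single line can be wrong, for example when unquoted text values contain many spaces, which is one reason for setting the delimiter explicitly when it is known.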
- Quote
The character used to surround text values.
The single and double quote characters are tested automatically.
A custom quote character can also be specified:
var CSV : TBICSV; CSV.Quote:= '"';
- Header
TBICSV automatically tests whether the first lines of the CSV content can be considered a "header" that contains the names of the CSV fields.
The header mode can also be set explicitly:
CSV.Header.Headers:= TTextHeaders.Yes; // Auto, Yes, No
- Decimal separator
Floating-point values are parsed by trying both "." and "," as the separator character between the integral and decimal parts of the number.
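The idea of trying both decimal separators can be sketched with the standard SysUtils routines. This is only an illustration under assumptions, not the code TBICSV uses: TryParseFloat and Show are hypothetical helpers, and trying "." before "," is an arbitrary choice.

```pascal
program DecimalSeparatorDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper: tries '.' first, then ',' as the decimal separator.
function TryParseFloat(const S: string; out Value: Double): Boolean;
var
  FS: TFormatSettings;
begin
  FS := FormatSettings;        // start from a copy of the current settings
  FS.ThousandSeparator := #0;  // avoid clashes with the decimal separator
  FS.DecimalSeparator := '.';
  Result := TryStrToFloat(S, Value, FS);
  if not Result then
  begin
    FS.DecimalSeparator := ',';
    Result := TryStrToFloat(S, Value, FS);
  end;
end;

procedure Show(const S: string);
var
  V: Double;
begin
  if TryParseFloat(S, V) then
    WriteLn(S, ' -> ', V:0:2)
  else
    WriteLn(S, ' -> not a number');
end;

begin
  Show('3.14');
  Show('3,14');
  Show('abc');
end.
```

Note that a value such as "1,234" is ambiguous: this sketch reads it as a decimal number, while in other locales it would mean one thousand two hundred thirty-four.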
When importing CSV data, each "column" (or "field") is automatically created using the most appropriate TDataKind (integer, floating point, text, boolean, etc).
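Choosing a kind for a column can be pictured as finding the narrowest type that fits every cell. The sketch below is a simplified assumption of how such inference might work, not TeeBI's algorithm: TGuessedKind, KindOfCell and KindOfColumn are hypothetical names, and only "." is accepted as the decimal separator to keep the example short.

```pascal
program GuessKindDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

type
  // Hypothetical enumeration for this sketch; it is not TeeBI's TDataKind.
  TGuessedKind = (gkInteger, gkFloat, gkBoolean, gkText);

const
  KindNames: array[TGuessedKind] of string =
    ('integer', 'float', 'boolean', 'text');

// Classifies a single cell by trying the narrowest kinds first.
function KindOfCell(const S: string): TGuessedKind;
var
  I: Int64;
  F: Double;
  FS: TFormatSettings;
begin
  FS := FormatSettings;
  FS.DecimalSeparator := '.';
  FS.ThousandSeparator := #0;
  if TryStrToInt64(S, I) then
    Result := gkInteger
  else if TryStrToFloat(S, F, FS) then
    Result := gkFloat
  else if SameText(S, 'true') or SameText(S, 'false') then
    Result := gkBoolean
  else
    Result := gkText;
end;

// A column gets the narrowest kind that fits all of its cells.
function KindOfColumn(const Cells: array of string): TGuessedKind;
var
  K: TGuessedKind;
  I: Integer;
begin
  Result := gkText;
  if Length(Cells) = 0 then
    Exit;
  Result := KindOfCell(Cells[0]);
  for I := 1 to High(Cells) do
  begin
    K := KindOfCell(Cells[I]);
    if K <> Result then
      if ((K = gkInteger) and (Result = gkFloat)) or
         ((K = gkFloat) and (Result = gkInteger)) then
        Result := gkFloat   // integers widen to floating point
      else
      begin
        Result := gkText;   // any other mix falls back to text
        Exit;
      end;
  end;
end;

begin
  WriteLn(KindNames[KindOfColumn(['1', '2', '3'])]);    // integer
  WriteLn(KindNames[KindOfColumn(['1', '2.5'])]);       // float
  WriteLn(KindNames[KindOfColumn(['true', 'false'])]);  // boolean
  WriteLn(KindNames[KindOfColumn(['1', 'abc'])]);       // text
end.
```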
Several optimizations are used to achieve high speed when importing CSV text.
For example, a 3 GB CSV file containing about 1 billion cells (150k rows by 7k columns) takes 120 seconds to import on an ordinary desktop machine with an i7 CPU and a hard disk.
However, once the data has been imported and saved in the TeeBI native binary format, the same data takes only 9 seconds to load, roughly 13 times faster.
Parallelization is not currently used during the import phase.