BCSV (File format)
|The content described on this page is 100% documented.|
BCSV stands for Binary Comma Separated Values and is the most common data format used in both Super Mario Galaxy games. Some older GameCube titles, such as Luigi's Mansion and Donkey Kong Jungle Beat, use this data format as well.
As the name suggests, BCSV is a binary variant of comma-separated values (CSV): the data is laid out in a table-like structure of entries (rows) and fields (columns). The column names are hashed for faster access. The data is flatbuffer-like and is loaded directly into memory, so it does not have to be deserialized first. The game supports reading data as signed and unsigned integers (8, 16 and 32 bits), single-precision floats and strings. Each string is a null-terminated Shift-JIS (code page 932) encoded string.
All BCSV files are padded to the next 32-byte boundary with '@' (0x40).
There is no consistent file extension for BCSV data. Instead, the games contain files with the extensions BCSV, BANMT, BCAM, PA and TBL. BCSV files that use TBL as their file extension are expected to be sorted in ascending order by some specific field.
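The padding rule described above can be sketched in Python as follows. This is only an illustration: the function name `pad_to_32` is made up for this example, and it assumes that a file whose length is already a multiple of 32 receives no additional padding.

```python
def pad_to_32(data: bytes) -> bytes:
    """Pad data with '@' (0x40) up to the next multiple of 32 bytes.

    Assumption: data that is already 32-byte aligned is returned unchanged.
    """
    remainder = len(data) % 32
    if remainder == 0:
        return data
    return data + b"@" * (32 - remainder)


padded = pad_to_32(b"example")
print(len(padded))  # total length is now a multiple of 32
```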
Each BCSV file starts with a header:
|0x08||u32||Offset to the entry data section|
|0x0C||u32||The size of each entry in bytes|
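As a minimal sketch, the two header values listed above could be read with Python's `struct` module. The helper name `read_header_tail` is hypothetical, and big-endian byte order is an assumption based on the GameCube and Wii being big-endian platforms. The header bytes before 0x08 are not covered by the rows above, so the sketch skips them.

```python
import struct


def read_header_tail(buf: bytes) -> tuple[int, int]:
    """Return (entry data offset, entry size) from a BCSV header.

    Assumption: values are stored big-endian. Only the two u32 values
    at 0x08 and 0x0C are read; bytes 0x00-0x07 are ignored here.
    """
    data_offset, entry_size = struct.unpack_from(">II", buf, 0x08)
    return data_offset, entry_size


# Synthetic 16-byte header: 8 ignored bytes, then the two known values.
header = bytes(8) + struct.pack(">II", 0x40, 12)
print(read_header_tail(header))
```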
Right after the header comes the list of fields. The structure of a single field is as follows:
|0x08||u16||Offset to the data under this field in an individual entry|
|0x0A||u8||Data shift amount|
|0x0B||u8||The type of data that this field uses|
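The three field members listed above could be unpacked like this. It is a sketch only: `FieldTail` and `read_field_tail` are names invented for this example, big-endian byte order is assumed, and the first eight bytes of each field record are left untouched because they are not described by the rows above.

```python
import struct
from typing import NamedTuple


class FieldTail(NamedTuple):
    entry_offset: int  # u16 at 0x08: offset of this field's data within an entry
    shift: int         # u8 at 0x0A: data shift amount
    type_id: int       # u8 at 0x0B: data type ID


def read_field_tail(buf: bytes, field_start: int) -> FieldTail:
    """Read the members at 0x08-0x0B of the field record at field_start.

    Assumption: big-endian byte order.
    """
    return FieldTail(*struct.unpack_from(">HBB", buf, field_start + 0x08))


# Synthetic field record: 8 ignored bytes, offset 4, shift 8, type 0x04 (SHORT).
record = bytes(8) + struct.pack(">HBB", 4, 8, 0x04)
print(read_field_tail(record, 0))
```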
Fields may cover one of the following data types:
|Name||ID||Size (in bytes)||Description|
|LONG||0x00||4||32-bit integer. Signedness is not specified. ANDed with the bitmask and shifted right by the field's shift amount.|
|STRING||0x01||32||Embedded string. Deprecated. Use STRING_OFFSET instead.|
|FLOAT||0x02||4||Single-precision floating-point value.|
|LONG_2||0x03||4||32-bit integer. Signedness is not specified. ANDed with the bitmask and shifted right by the field's shift amount.|
|SHORT||0x04||2||16-bit integer. Signedness is not specified. ANDed with the bitmask and shifted right by the field's shift amount.|
|CHAR||0x05||1||8-bit integer. Signedness is not specified. ANDed with the bitmask and shifted right by the field's shift amount.|
|STRING_OFFSET||0x06||4||32-bit offset into string table.|
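The mask-and-shift step that the integer types share can be illustrated as below. The function name `decode_int` and the sample values are made up for this example. The idea that two fields share one 32-bit word is used purely for illustration.

```python
def decode_int(raw: int, mask: int, shift: int) -> int:
    """AND the raw value with the field's bitmask, then shift right."""
    return (raw & mask) >> shift


# Hypothetical case: one 32-bit word holding two separate values.
raw_word = 0x12345678
high_byte_field = decode_int(raw_word, 0x00FF0000, 16)
low_half_field = decode_int(raw_word, 0x0000FFFF, 0)
print(hex(high_byte_field), hex(low_half_field))  # 0x34 0x5678
```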
Field Order & Entry Size
For efficiency and because of hardware limitations, the field offsets and the total entry size are calculated from a special ordering of the fields. This only affects the order of the data within an entry, not the order of the fields in the field list. When saving, a tool should calculate the field offsets and the total entry size according to this order: STRING < FLOAT < LONG < LONG_2 < SHORT < CHAR < STRING_OFFSET. A sample implementation that calculates these values properly can be found in pyjmap on GitHub.
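The following is a rough sketch of that calculation, not the pyjmap implementation. It assumes that every field occupies its own storage (no fields sharing bytes through bitmasks), that no padding is inserted between fields, that CHAR takes 1 byte (matching its description as an 8-bit integer), and that the entry size is rounded up to a multiple of four bytes. The names `layout_fields`, `TYPE_ORDER` and `TYPE_SIZE` are hypothetical.

```python
# Ordering used when laying out data inside an entry.
TYPE_ORDER = ["STRING", "FLOAT", "LONG", "LONG_2", "SHORT", "CHAR", "STRING_OFFSET"]

# Sizes in bytes; CHAR is assumed to be 1 byte (an 8-bit integer).
TYPE_SIZE = {
    "LONG": 4, "STRING": 32, "FLOAT": 4, "LONG_2": 4,
    "SHORT": 2, "CHAR": 1, "STRING_OFFSET": 4,
}


def layout_fields(fields):
    """Compute per-field offsets and the total entry size.

    fields: list of (name, type_name) in field-list order.
    Returns ({name: offset}, entry_size).
    """
    # sorted() is stable, so fields of the same type keep their relative order.
    ordered = sorted(fields, key=lambda f: TYPE_ORDER.index(f[1]))
    offsets = {}
    pos = 0
    for name, type_name in ordered:
        offsets[name] = pos
        pos += TYPE_SIZE[type_name]
    entry_size = (pos + 3) & ~3  # round up to a multiple of four bytes
    return offsets, entry_size


offsets, size = layout_fields(
    [("id", "LONG"), ("flag", "CHAR"), ("scale", "FLOAT"), ("count", "SHORT")]
)
print(offsets, size)
```

Note that the field list itself keeps its original order; only the offsets stored in each field change.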
The entry data section contains the individual data entries. The structure of their data is specified by the BCSV's fields. Each entry is aligned to four bytes.
Right after the data comes the string pool which contains all strings used within the BCSV.
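Reading one string from the pool could look like the sketch below. It assumes that a STRING_OFFSET value is relative to the start of the string pool, and it uses Python's `cp932` codec for the null-terminated Shift-JIS strings described on this page. The function name `read_string` and the sample pool are invented for the example.

```python
def read_string(pool: bytes, offset: int) -> str:
    """Decode the null-terminated string starting at offset in the pool.

    Assumption: offset is relative to the start of the string pool.
    """
    end = pool.index(b"\x00", offset)  # raises ValueError if unterminated
    return pool[offset:end].decode("cp932")


# Synthetic pool holding two strings.
pool = b"Mario\x00" + "マリオ".encode("cp932") + b"\x00"
print(read_string(pool, 0))
print(read_string(pool, 6))
```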