Chunks

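The chunk index maps logical grid coordinates to on-disk payload bytes. Because every entry records the exact offset and length of one chunk, a reader can fetch or memory-map only the chunks a selection touches, which is what makes partial I/O and mmap reads efficient.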

Chunk index header (32 bytes)

Starts at chunk_index_offset (= align8(40 + dataset_blob_len)).
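As a quick illustration of the offset formula above, here is a minimal sketch. It assumes `align8` means "round up to the next multiple of 8", which the name suggests but this page does not define; the helper names are illustrative.

```python
def align8(n: int) -> int:
    """Round n up to the next multiple of 8 (assumed meaning of align8)."""
    return (n + 7) & ~7


def chunk_index_offset(dataset_blob_len: int) -> int:
    """File offset of the chunk index header, per the formula above."""
    return align8(40 + dataset_blob_len)
```

For example, a 3-byte dataset blob ends at byte 43, so under this assumption the index would start at byte 48.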

| Offset | Size | Field | Notes |
| --- | --- | --- | --- |
| 0 | 4 | magic | ASCII TIDX |
| 4 | 4 | index_version | Must be 1 |
| 8 | 8 | entry_count | Number of fixed-size entries |
| 16 | 2 | memory_budget_percent_bps | Basis points (10000 = 100%); 0 = engine default (25%) |
| 18 | 2 | reserved | Write 0 |
| 20 | 4 | memory_budget_bytes | Fixed RAM cap; 0 = use percent of host RAM |
| 24 | 8 | reserved | Write 0 |

Total index length: 32 + entry_count × 104.
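The header layout can be read with a single `struct` format. The sketch below is one possible reader, not the reference implementation. It assumes the integer fields are little-endian (this page only states LE for tensor payloads), and the names `HEADER`, `ChunkIndexHeader`, and `parse_header` are hypothetical.

```python
import struct
from typing import NamedTuple

# Assumption: header integers are little-endian; "<" also disables padding.
# magic(4s) index_version(I) entry_count(Q) percent_bps(H) reserved(H)
# memory_budget_bytes(I) reserved(Q) -> 32 bytes total.
HEADER = struct.Struct("<4sIQHHIQ")
ENTRY_SIZE = 104


class ChunkIndexHeader(NamedTuple):
    entry_count: int
    memory_budget_percent_bps: int
    memory_budget_bytes: int

    @property
    def index_len(self) -> int:
        """Total index length: 32 + entry_count * 104."""
        return HEADER.size + self.entry_count * ENTRY_SIZE


def parse_header(buf: bytes, offset: int = 0) -> ChunkIndexHeader:
    """Decode and validate the 32-byte chunk index header at `offset`."""
    magic, version, entry_count, bps, _rsv0, budget_bytes, _rsv1 = (
        HEADER.unpack_from(buf, offset)
    )
    if magic != b"TIDX":
        raise ValueError(f"bad chunk index magic: {magic!r}")
    if version != 1:
        raise ValueError(f"unsupported index_version: {version}")
    return ChunkIndexHeader(entry_count, bps, budget_bytes)
```

A header with `entry_count = 3` gives `index_len == 32 + 3 * 104 == 344`.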

Chunk index entry (104 bytes)

| Field | Type | Notes |
| --- | --- | --- |
| dataset_id | u64 | Index into dataset directory |
| chunk_index[8] | u64 × 8 | Grid coordinates i0 … i7; unused slots are 0 |
| payload_offset | u64 | File offset to stored bytes |
| raw_byte_len | u64 | Logical uncompressed size |
| stored_byte_len | u64 | Bytes on disk at payload_offset |
| codec | u32 | 0 = raw, 1 = zstd |
| reserved | u32 | Write 0 |

Payload bytes must satisfy: payload_offset + stored_byte_len ≤ file_len.
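A 104-byte entry can be decoded the same way, with the bounds rule above applied at parse time. This is a sketch under the same little-endian assumption; `ChunkEntry` and `parse_entry` are hypothetical names, and fields are assumed to be packed in table order with no padding (which matches the stated 104-byte size).

```python
import struct
from dataclasses import dataclass
from typing import Tuple

# Assumption: little-endian, fields in table order, no padding.
# dataset_id(Q) chunk_index[8](8Q) payload_offset(Q) raw_byte_len(Q)
# stored_byte_len(Q) codec(I) reserved(I) -> 104 bytes total.
ENTRY = struct.Struct("<Q8QQQQII")


@dataclass(frozen=True)
class ChunkEntry:
    dataset_id: int
    chunk_index: Tuple[int, ...]  # grid coordinates i0..i7
    payload_offset: int
    raw_byte_len: int
    stored_byte_len: int
    codec: int


def parse_entry(buf: bytes, offset: int, file_len: int) -> ChunkEntry:
    """Decode one index entry and check that its payload lies inside the file."""
    f = ENTRY.unpack_from(buf, offset)
    entry = ChunkEntry(
        dataset_id=f[0],
        chunk_index=f[1:9],
        payload_offset=f[9],
        raw_byte_len=f[10],
        stored_byte_len=f[11],
        codec=f[12],
    )
    if entry.payload_offset + entry.stored_byte_len > file_len:
        raise ValueError("payload extends past end of file")
    return entry
```

Entry `k` of an index would then sit at `chunk_index_offset + 32 + k * 104`.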

Codecs

| Codec | stored_byte_len vs raw_byte_len | On-disk bytes |
| --- | --- | --- |
| 0 raw | must be equal | tensor payload (LE bytes for dataset dtype) |
| 1 zstd | stored = compressed, raw = decoded size | zstd frame |
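The length rule per codec can be checked before any payload is read. The following is a small sketch with hypothetical names; rejecting codec values other than 0 and 1 is an assumption, since this page does not say how readers should treat unknown codecs.

```python
CODEC_RAW = 0
CODEC_ZSTD = 1


def check_codec_lengths(codec: int, raw_byte_len: int, stored_byte_len: int) -> None:
    """Raise ValueError if an entry's lengths contradict its codec."""
    if codec == CODEC_RAW:
        # Raw payloads are stored verbatim, so both lengths must match.
        if stored_byte_len != raw_byte_len:
            raise ValueError("raw chunk: stored_byte_len != raw_byte_len")
    elif codec == CODEC_ZSTD:
        # stored_byte_len is the compressed size and raw_byte_len the decoded
        # size; the latter can only be confirmed after decompression.
        pass
    else:
        # Assumption: unknown codecs are treated as errors.
        raise ValueError(f"unknown codec: {codec}")
```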

Mmap-friendly access

The query engine resolves a logical selection to chunk coordinates, then memory-maps only the payload spans that intersect it:

dataset name → dataset_id → chunk coords → index row → mmap payload
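The last step of that pipeline, mapping a single payload span, might look like the sketch below. It uses Python's standard `mmap` module and is not the engine's actual code; `read_raw_payload` is a hypothetical helper. Because `mmap` offsets must be a multiple of `mmap.ALLOCATIONGRANULARITY`, the mapping starts at the aligned address just below `payload_offset`.

```python
import mmap


def read_raw_payload(path: str, payload_offset: int, stored_byte_len: int) -> bytes:
    """Map only the region covering one payload and return its bytes."""
    if stored_byte_len == 0:
        return b""  # mmap cannot create a zero-length mapping of a span
    gran = mmap.ALLOCATIONGRANULARITY
    map_start = payload_offset - (payload_offset % gran)  # align down
    delta = payload_offset - map_start  # payload position inside the mapping
    with open(path, "rb") as f:
        with mmap.mmap(
            f.fileno(),
            delta + stored_byte_len,
            access=mmap.ACCESS_READ,
            offset=map_start,
        ) as m:
            # Copying to bytes keeps the sketch simple; a real reader would
            # more likely keep the mapping open and hand out a memoryview.
            return bytes(m[delta : delta + stored_byte_len])
```

For codec 0 the returned bytes are the tensor data itself; for codec 1 they are a zstd frame that still has to be decompressed.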

For a full dense selection over raw (codec = 0) payloads, the engine may treat contiguous payloads as one logical byte stream when:

  1. Each chunk's payload_offset + raw_byte_len equals the next chunk's payload_offset (in read-plan order)
  2. The sum of raw_byte_len equals the logical selection size

Reference writers append payloads sequentially, so converted multi-chunk grids typically satisfy this for full-file scans.
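The two conditions above can be tested directly on a read plan. This sketch takes entries as `(payload_offset, raw_byte_len, codec)` tuples in read-plan order; that tuple shape and the function name are illustrative, not part of the format. Treating an empty plan as non-contiguous is also an assumption.

```python
from typing import Iterable, Tuple


def is_contiguous_raw_run(
    plan: Iterable[Tuple[int, int, int]], selection_bytes: int
) -> bool:
    """True if the plan's raw payloads form one unbroken byte stream
    whose total length equals the logical selection size."""
    expected_offset = None
    total = 0
    for payload_offset, raw_byte_len, codec in plan:
        if codec != 0:
            return False  # only raw payloads qualify
        if expected_offset is not None and payload_offset != expected_offset:
            return False  # condition 1: gap or overlap between chunks
        expected_offset = payload_offset + raw_byte_len
        total += raw_byte_len
    if expected_offset is None:
        return False  # assumption: an empty plan is not a contiguous run
    return total == selection_bytes  # condition 2
```

For example, `[(4096, 64, 0), (4160, 64, 0)]` with a 128-byte selection passes, while moving the second chunk to offset 4168 fails condition 1.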

Inspecting chunks

```bash
tet info file.tet --chunks           # index table (default 32 rows)
tet info file.tet --chunks -n 0      # all rows
tet info file.tet --all              # layout + datasets + chunks + history
```

See Mmap read patterns for Rust and Python examples.

Latka Industries