every file starts with an unsigned 8 bit integer in big-endian format, describing the length of the metadata for the file. the metadata which follows is a messagepack object containing three top level keys: video_tracks, subtitle_tracks and attachments.
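as a rough illustration of the layout above, the following python sketch splits the start of a file into the length prefix and the raw metadata bytes. the function name `read_metadata_blob` and the `prefix_bytes` parameter are hypothetical; the default of one byte simply takes the 8 bit prefix described above literally. decoding the messagepack object itself is left to whichever messagepack library is in use.

```python
import io

def read_metadata_blob(stream, prefix_bytes=1):
    """Return the raw messagepack metadata bytes from the start of a file.

    prefix_bytes is the size of the big-endian length prefix. The default
    of 1 follows the 8 bit integer in the text and is an assumption here.
    """
    prefix = stream.read(prefix_bytes)
    if len(prefix) != prefix_bytes:
        raise ValueError("file too short for the length prefix")
    length = int.from_bytes(prefix, "big")
    blob = stream.read(length)
    if len(blob) != length:
        raise ValueError("file too short for the metadata")
    return blob

# hypothetical file contents: a 3 byte metadata blob followed by packet data
example = io.BytesIO(bytes([3]) + b"\x83\x01\x02" + b"packets...")
metadata = read_metadata_blob(example)
```

after the call, the stream is positioned at the first byte following the metadata, which is where the packets begin.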
video tracks are described by objects with the following keys.
key | description |
---|---|
name (optional) | name for the track |
color_mode | color mode used for encoding the track, either 'True' or 'EightBit' |
compression | compression used on this track. may be 'None' or 'Zstd' |
width | width of the video track |
height | height of the video track |
codec_private (optional) | codec specific data, as raw bytes. currently unused |
encode_time | time track was encoded, in seconds since the unix epoch |
index | the stream index, which is encoded into every packet. does not necessarily correspond to the track's index in video_tracks. |
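the table above can be turned into a small sanity check. this is a sketch only: it assumes the messagepack object has already been decoded into a python dict and that color_mode and compression arrive as plain strings, neither of which is stated in this document. `check_video_track` is a hypothetical helper name.

```python
COLOR_MODES = {"True", "EightBit"}
COMPRESSIONS = {"None", "Zstd"}
# name and codec_private are optional, so they are not listed here
REQUIRED_KEYS = {"color_mode", "compression", "width", "height", "encode_time", "index"}

def check_video_track(track):
    """Return a list of problems found in a decoded video track object."""
    missing = REQUIRED_KEYS - set(track)
    problems = [f"missing key: {key}" for key in sorted(missing)]
    if "color_mode" in track and track["color_mode"] not in COLOR_MODES:
        problems.append(f"unknown color_mode: {track['color_mode']!r}")
    if "compression" in track and track["compression"] not in COMPRESSIONS:
        problems.append(f"unknown compression: {track['compression']!r}")
    return problems

# hypothetical track object, as it might look after decoding
good_track = {
    "color_mode": "True",
    "compression": "Zstd",
    "width": 80,
    "height": 48,
    "encode_time": 1700000000,
    "index": 0,
}
problems = check_video_track(good_track)
```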
subtitle tracks are described by objects with the following keys.
key | description |
---|---|
name (optional) | name for the track |
encode_time | time track was encoded, in seconds since the unix epoch |
format | format of the subtitle track. may be 'SubRip', 'SubStationAlpha', or 'Unknown'. |
codec_private (optional) | codec specific data, as raw bytes. if format is 'SubStationAlpha', this will contain the style and info headers. |
index | the stream index, which is encoded into every packet. does not necessarily correspond to the track's index in subtitle_tracks. |
attachments are miscellaneous files and other data attached to the file that are not encoded into the packet format.
the attachments field is an array of Attachment enum values.
enum variant | data | comment |
---|---|---|
Binary | byte array | catch-all attachment type for misc. binary data. |
Midi | byte array | a MIDI track; as the bytes of a standard MIDI file. |
the contents of the file are composed of packets of data, each of which contains the data of e.g. a video frame. they are written in order to the container file. they are encoded as follows:
field | data type |
---|---|
packet length | 64bit unsigned int |
compression marker | 8bit unsigned int |
uncompressed size (only present if data is compressed) | 64bit unsigned int |
stream index | 32bit unsigned int |
adler32 checksum | 32bit unsigned int |
presentation timestamp (ns) | 64bit unsigned int |
duration (ns) | 64bit unsigned int |
for video packets, the duration is undefined; the presentation timestamp defines when the frame should be displayed to the user, relative to the start of the video.
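the packet fields can be read with python's `struct` module. the sketch below makes several assumptions that this document does not confirm: the fields are big-endian like the metadata length prefix, they appear in the order of the table with no padding, and a compression marker of zero means "not compressed". `PacketHeader` and `parse_packet_header` are hypothetical names.

```python
import struct
from dataclasses import dataclass
from typing import Optional

@dataclass
class PacketHeader:
    packet_length: int
    compressed: bool
    uncompressed_size: Optional[int]  # None when the packet is not compressed
    stream_index: int
    checksum: int
    pts_ns: int
    duration_ns: int
    header_size: int  # number of bytes the header occupied

def parse_packet_header(buf, offset=0):
    """Parse one packet header from buf, starting at offset."""
    packet_length, marker = struct.unpack_from(">QB", buf, offset)
    pos = offset + 9
    uncompressed_size = None
    if marker != 0:  # assumption: any non-zero marker means compressed
        (uncompressed_size,) = struct.unpack_from(">Q", buf, pos)
        pos += 8
    stream_index, checksum, pts_ns, duration_ns = struct.unpack_from(">IIQQ", buf, pos)
    pos += 24
    return PacketHeader(
        packet_length=packet_length,
        compressed=marker != 0,
        uncompressed_size=uncompressed_size,
        stream_index=stream_index,
        checksum=checksum,
        pts_ns=pts_ns,
        duration_ns=duration_ns,
        header_size=pos - offset,
    )

# hypothetical uncompressed packet header: stream 1, shown at the 1 second mark
plain = struct.pack(">QBIIQQ", 5, 0, 1, 0xDEADBEEF, 1_000_000_000, 0)
header = parse_packet_header(plain)
```

python's `zlib.adler32` computes an adler32 checksum that could be compared against the checksum field; which bytes the checksum covers is not specified here, so that step is left out.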
video frames are encoded as utf-8 strings containing a series of half-block unicode characters, where the upper half is
colored using a foreground-setting ANSI escape code and the lower half is colored using a background-setting one,
displaying two 'pixels' with one character.
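to make the half-block trick concrete, here is a sketch that turns two rows of pixels into one line of terminal text. it assumes the upper half block character (U+2580) and 24-bit color escape sequences; the exact escape sequences an encoder emits for the 'True' and 'EightBit' color modes are not given in this document, and the trailing reset is a choice made for this example. `render_rows` is a hypothetical name.

```python
UPPER_HALF_BLOCK = "\u2580"  # the glyph fills the top half of the cell
RESET = "\x1b[0m"

def render_rows(top, bottom):
    """Render two rows of (r, g, b) pixels as one line of terminal text."""
    cells = []
    for (tr, tg, tb), (br, bg, bb) in zip(top, bottom):
        foreground = f"\x1b[38;2;{tr};{tg};{tb}m"  # colors the upper pixel
        background = f"\x1b[48;2;{br};{bg};{bb}m"  # colors the lower pixel
        cells.append(foreground + background + UPPER_HALF_BLOCK)
    return "".join(cells) + RESET

# one character wide: a red pixel stacked on top of a blue pixel
line = render_rows([(255, 0, 0)], [(0, 0, 255)])
```

a frame with an even number of pixel rows therefore needs half as many text lines as it has rows.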
if the packet belongs to a SubRip subtitle track, the data will be the utf-8 encoded text of the subtitle. the start of the subtitle is used to set the presentation timestamp, and the end of the subtitle is used to set the packet duration.
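the mapping from a SubRip timing line to packet times might look like the sketch below. it assumes the duration is the subtitle end minus the subtitle start, both in nanoseconds to match the packet fields; this document only says the end of the subtitle is used to set the duration. both function names are hypothetical.

```python
def srt_time_to_ns(stamp):
    """Convert a SubRip timestamp such as '00:01:02,500' to nanoseconds."""
    clock, millis = stamp.strip().split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)
    return total_ms * 1_000_000

def srt_timing_to_packet_times(timing_line):
    """Return (presentation timestamp, duration) in ns for an SRT timing line."""
    start, end = timing_line.split("-->")
    pts = srt_time_to_ns(start)
    return pts, srt_time_to_ns(end) - pts

pts_ns, duration_ns = srt_timing_to_packet_times("00:00:01,500 --> 00:00:04,000")
```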
if the packet belongs to a SubStationAlpha subtitle track, the data will be a utf-8 encoded SSA entry, stored as in the Matroska container: in the format 'ReadOrder, Layer, Style, Name, MarginL, MarginR, MarginV, Effect, Text'.
the start of the subtitle is used to set the presentation timestamp, and the end of the subtitle is used to set the packet duration.
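a SubStationAlpha packet can be split back into its fields as sketched below. because the Text field may itself contain commas, the split is limited to the first eight commas. fields are returned as strings exactly as stored; `parse_ssa_entry` is a hypothetical name.

```python
SSA_FIELDS = (
    "ReadOrder", "Layer", "Style", "Name",
    "MarginL", "MarginR", "MarginV", "Effect", "Text",
)

def parse_ssa_entry(data):
    """Split the utf-8 payload of an SSA packet into a dict of named fields."""
    text = data.decode("utf-8")
    # only split on the first 8 commas so commas inside Text are preserved
    parts = text.split(",", len(SSA_FIELDS) - 1)
    if len(parts) != len(SSA_FIELDS):
        raise ValueError("expected 9 comma separated fields")
    return dict(zip(SSA_FIELDS, parts))

# hypothetical packet payload
entry = parse_ssa_entry(b"0,0,Default,,0,0,0,,Hello, world!")
```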