Taken from: https://motchallenge.net/instructions/ File Format Please submit your results as a single .zip file. The results for each sequence must be stored in a separate .txt file in the archive's root folder. The file name must be exactly like the sequence name (case sensitive). The file format should be the same as the ground truth file, which is a CSV text-file containing one object instance per line. Each line must contain 10 values: , , , , , , , , , The conf value contains the detection confidence in the det.txt files. For the ground truth, it acts as a flag whether the entry is to be considered. A value of 0 means that this particular instance is ignored in the evaluation, while any other value can be used to mark it as active. For submitted results, all lines in the .txt file are considered. The world coordinates x,y,z are ignored for the 2D challenge and can be filled with -1. Similarly, the bounding boxes are ignored for the 3D challenge. However, each line is still required to contain 10 values. All frame numbers, target IDs and bounding boxes are 1-based. Here is an example: Tracking with bounding boxes (MOT15, MOT16, MOT17, MOT20) 1, 3, 794.27, 247.59, 71.245, 174.88, -1, -1, -1, -1 1, 6, 1648.1, 119.61, 66.504, 163.24, -1, -1, -1, -1 1, 8, 875.49, 399.98, 95.303, 233.93, -1, -1, -1, -1 ... Multi Object Tracking & Segmentation (MOTS Challenge) Each line of an annotation txt file is structured like this (where rle means run-length encoding from COCO): time_frame id class_id img_height img_width rle An example line from a txt file: 52 1005 1 375 1242 WSV:2d;1O10000O10000O1O100O100O1O100O1000000000000000O100O102N5K00O1O1N2O110OO2O001O1NTga3 Meaning: time frame 52 object id 1005 (meaning class id is 1, i.e. car and instance id is 5) class id 1 image height 375 image width 1242 rle WSV:2d;1O10000O10000O1O100O100O1O100O1000000000000000O100O...1O1N image height, image width, and rle can be used together to decode a mask using cocotools(https://github.com/cocodataset/cocoapi) .