Question about mkvmerge and external timestamp files

Dear all,

While studying the documentation for mkvmerge, I was very excited to see that it supports external timestamp files. This is ingenious, because it enables the end user to solve problems or conduct tests that otherwise wouldn’t be possible without special software.

But when joining multiple input files, an external timestamp file can only be applied to the first input file, according to the documentation.

Since I’d like to try a certain workflow that would require me to set every timestamp for every video and audio frame from every input file, I’d like to understand where the limitation to the first input file comes from, and would be grateful if somebody could briefly explain it.

Actually, I’m thinking of opening a feature request for an extension of the external timestamp file mechanism so that a timestamp file could be applied to multiple input files (or so that there could be one timestamp file per input file). But I really don’t want to bother somebody with it as long as I haven’t understood the technical situation.

Thank you very much in advance, and best regards!

I’m not sure, but I think that you specify a timestamp file for the first file only, and that this file must also contain the timestamps for all the frames of the subsequently appended tracks.

Thanks for the fast reply.

That’s great. So I misunderstood the documentation. I’ll try the external timestamp files in the next few days and then report back.

Best regards!

In the meantime, I have come up with a workflow and a custom utility that produces the external timestamp files in the way I need them.

The good thing is that mkvmerge in principle processes those files. I now can confirm that indeed we need the external timestamp files for each stream to contain the timestamps for all frames from all files that we want to concatenate, and that the external timestamp files must be specified for the first file only. In other words, exactly what @mbunkus has written, applied to each stream.
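To make the layout concrete, here is a minimal Python sketch of what such a per-stream file and the matching command line could look like. It is not the custom utility mentioned above. The helper names, file names, track ID, and timestamp values are all made up for illustration, and it assumes the v2 format (a header line followed by one timestamp in milliseconds per line) as well as mkvmerge's `--timestamps TID:file` option and `+` append syntax as I understand them from the documentation.

```python
def v2_timestamp_file(per_file_timestamps_ms):
    """Return the text of ONE v2 timestamp file for ONE track.

    per_file_timestamps_ms holds one list of timestamps (in ms) per input
    file, in append order. All of them go into the same file.
    """
    lines = ["# timestamp format v2"]
    for timestamps in per_file_timestamps_ms:
        lines.extend(str(ts) for ts in timestamps)
    return "\n".join(lines) + "\n"


def mkvmerge_command(output, inputs, timestamp_files):
    """Build an mkvmerge argument list (hypothetical helper, nothing is run).

    timestamp_files maps a track ID to a timestamp file. The options are
    placed before the FIRST input only, as described in the post.
    """
    cmd = ["mkvmerge", "-o", output]
    for track_id, path in sorted(timestamp_files.items()):
        cmd += ["--timestamps", f"{track_id}:{path}"]
    cmd.append(inputs[0])
    for appended in inputs[1:]:
        cmd += ["+", appended]  # "+" appends the next file to the previous one
    return cmd


# Made-up example: two input files with three audio frames each.
audio_txt = v2_timestamp_file([[0, 32, 64], [96, 128, 160]])
cmd = mkvmerge_command("out.mkv", ["first.m2ts", "second.m2ts"], {1: "audio.txt"})
```

The track ID `1` is only a placeholder; the real IDs would have to be looked up for the actual input, for example with `mkvmerge --identify`.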

Unfortunately, my effort was pointless because there’s a small detail that does not work the way I would need it; it would have been the key point in this case. It is best explained by example. Suppose the following situation:

You have two input files that you want to concatenate. Both input files contain an audio stream that consists of 3 audio frames. Hence, there are six audio frames in total. You have the following external timestamp file for the audio stream:

# timestamp format v2

Here, the third timestamp belongs to the last audio frame from the first input file, and the fourth timestamp belongs to the first audio frame from the second input file. As we can see, there are only 26 ms between the third and the fourth timestamp, although an audio frame takes 32 ms.
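The condition can also be checked mechanically. The following Python sketch reports every frame that starts before the previous frame has ended. The function name is hypothetical, and the timestamp values are made up to match the 26 ms / 32 ms figures; they are not necessarily the values from my actual file.

```python
def find_overlaps(timestamps_ms, frame_duration_ms):
    """Return (index, overlap_ms) for each frame that starts too early.

    A frame starts too early if its timestamp is smaller than
    (previous timestamp + frame duration).
    """
    overlaps = []
    for i in range(1, len(timestamps_ms)):
        distance = timestamps_ms[i] - timestamps_ms[i - 1]
        if distance < frame_duration_ms:
            overlaps.append((i, frame_duration_ms - distance))
    return overlaps


# Illustrative values only: three frames per file, 32 ms per frame,
# and just 26 ms between the third and the fourth timestamp.
timestamps = [0, 32, 64, 90, 122, 154]
print(find_overlaps(timestamps, 32))  # prints [(3, 6)]
```

The index is zero-based, so `3` denotes the fourth timestamp, and `6` is the number of milliseconds by which the two frames overlap.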

Now, when concatenating both input files to a .mkv output file, using the external timestamp file, mkvmerge obviously detects that a whole audio frame does not fit between the third and the fourth timestamp, and thus throws away the third audio frame. This produces a gap of 26 ms in the audio stream.

This makes it impossible for me to reach my actual goal: I’d like to test how players behave if they meet the situation described above; that is, if they encounter an audio frame with a timestamp that is earlier than (previous timestamp plus frame duration).

Hence the questions:

  1. Can somebody please confirm that I have interpreted mkvinfo’s output correctly (that is, that mkvmerge really throws away audio frames in situations like the one described above)?

  2. If yes, is there a trick or command line switch that makes mkvmerge keep those frames? Not that I haven’t read the docs, but perhaps I’ve missed something …

Thank you very much in advance, and best regards.

Instruct mkvinfo to emit frame checksums with -v -c or use the summary with -s which also includes a frame checksum. That’s the most effective way of identifying frames.

mkvmerge usually only discards frames if the calculated timestamp is < 0.


I see. Thank you very much. I’ll keep that in mind for the future (but see my remark below after the next citation).

However, the problem here is that my input files (that I would like to concatenate) are M2TS, which mkvinfo naturally doesn’t process. This probably means that every M2TS file must first be converted to MKV on its own. Afterwards, I can build the set of all checksums from the individual input files, then build the set of all checksums from my already existing final MKV file, and then check whether both sets are identical.

So it’s a bit of work, but it surely is very reliable and saves me from manually searching for frames in mkvinfo’s output. Thanks again for the tip!
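The comparison itself could be scripted roughly as follows. This is only a sketch under assumptions: I assume that every frame line in mkvinfo's output contains the checksum in the form `adler 0x` followed by eight hex digits, which would need to be verified against real output, and the function names are hypothetical. The sketch counts checksums instead of building plain sets, because identical frames (digital silence, for example) could occur more than once.

```python
import re
from collections import Counter

# ASSUMPTION: each frame line contains "adler 0x<8 hex digits>".
ADLER_RE = re.compile(r"adler 0x([0-9a-fA-F]{8})")


def frame_checksums(mkvinfo_text):
    """Count how often each frame checksum occurs in mkvinfo's text output."""
    return Counter(m.group(1).lower() for m in ADLER_RE.finditer(mkvinfo_text))


def missing_frames(per_input_texts, merged_text):
    """Return the checksums present in the inputs but absent from the merged file.

    Counter subtraction keeps only positive counts, so an empty result
    means every input frame was found in the merged file.
    """
    expected = Counter()
    for text in per_input_texts:
        expected += frame_checksums(text)
    return expected - frame_checksums(merged_text)
```

The text passed in would be the saved mkvinfo output of each individually converted MKV file and of the final concatenated MKV file.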

Silly me! My observation was wrong, and mkvmerge works as intended. The attached screenshot shows my embarrassing mistake.

The third cluster denotes a new M2TS input file. We can see that the first frame of all audio streams has the same timestamp as the cluster itself; the same is true for the video stream. This actually is what I wanted to achieve with the help of the external timestamp files.

However, in the preceding cluster (the second cluster in the screenshot), the last audio frame in track 2 as well as in track 3 ends before the new cluster begins, and hence there is an audio gap. Well, not really.

When looking at the mkvinfo GUI output for the first time, I was a bit excited about all that stuff and totally missed that there are two additional block groups in the second cluster. In contrast to single blocks, block groups do not show the contained frames unless they are expanded, which is not the case by default.

This morning I realized my mistake and expanded the two block groups. And of course, the missing audio frames were there.

I also have understood why the last frames of the two audio streams have been put into block groups: mkvmerge has automatically tagged the block groups with the correct duration. This is necessary because those last audio frames actually end after the timestamp of the next audio frame (in the next cluster). Setting the correct block duration hopefully makes a player stop outputting the respective audio frame after the block duration and seamlessly switch to the next audio frame.

In summary, external timestamp files make it possible to “simulate” the feature I was requesting here. This is absolutely great! Big kudos for those files …

Now I’ll investigate whether this method actually solves the nasty problems I’m after, but this depends on the behavior of the players (in particular, whether the players respect the block duration in the sense explained above, or if not, how they cope with a situation where the timestamp of the next audio frame is before the end of the current audio frame).

Best regards!