Optimized setup for bulk (batch) handling of files to merge

I am currently merging a number of sets of .mp4 files into a single .mkv file each, using mkvmerge from the command line (the merge definitions are precomputed and persisted in JSON option files passed via @). This all works quite well.
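For context, one such merge definition might look roughly like this. This is only a minimal Python sketch: the paths and the name `episode01.json` are made up for illustration, but the `@file.json` mechanism and the JSON-array format are mkvmerge's documented option-file syntax.

```python
import json
import subprocess

# Hypothetical merge definition: a mkvmerge option file is a JSON array
# of command-line arguments, read when passed as @file.json.
# All paths below are invented for illustration.
options = [
    "--output", r"D:\merged\episode01.mkv",
    r"D:\source\episode01_part1.mp4",
    "+", r"D:\source\episode01_part2.mp4",  # '+' appends the next file
]

with open("episode01.json", "w", encoding="utf-8") as f:
    json.dump(options, f)

# Equivalent to running: mkvmerge @episode01.json
# (assumes mkvmerge is on the PATH)
subprocess.run(["mkvmerge", "@episode01.json"])
```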

I am looking to optimize my software/hardware setup in a way that speeds things up. I would be interested in a kind of ‘good/best practice’ guide.
I am currently on Windows 11, using mkvmerge 87.0 (64-bit).

My first approach was to run parallel instances of mkvmerge, each working on one JSON option file. While this works, speed degrades a lot (frustratingly so).
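The launcher part of that approach is straightforward; it is roughly something like this (a minimal Python sketch, assuming one JSON option file per job collected in a hypothetical `jobs` directory):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def run_merge(option_file: Path) -> int:
    # The heavy lifting happens in the mkvmerge child process, so
    # threads are enough to drive several instances at once.
    # mkvmerge exits with 0 on success, 1 on warnings, 2 on errors.
    return subprocess.run(["mkvmerge", f"@{option_file}"]).returncode

jobs = sorted(Path("jobs").glob("*.json"))  # hypothetical directory name

# max_workers caps the number of simultaneous mkvmerge instances.
with ThreadPoolExecutor(max_workers=2) as pool:
    for job, code in zip(jobs, pool.map(run_merge, jobs)):
        print(f"{job.name}: exit code {code}")
```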

It looks like I/O to the (internal) hard drive, which holds both the source and the target files, is the bottleneck (according to Task Manager). Eliminating all reads by caching the source files completely in RAM (zero read I/Os to the hard drive) does not solve the problem, though: write I/O still degrades a lot as soon as more than one mkvmerge process is working.

Is it generally a bad idea to have more than one mkvmerge process running at the same time?

Or does anybody have a hint for me on how to proceed? Perhaps drawn from experience, e.g. using SSDs instead of hard drives, distributing result files across several physical locations, using virtual machines, or …?

Best regards and thanks in advance

In general multiple mkvmerge instances can run in parallel just fine as long as no output file overlaps with another instance’s output or input files (it’d be fine to have two or more instances read the same file simultaneously, though). They don’t affect each other’s operation save for the amount of disk I/O, memory bandwidth & CPU power each one uses.
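If you want to verify that condition up front before launching anything, a small sanity check is easy to sketch (hypothetical helper; it only assumes you know each job’s input paths and output path):

```python
import os
from collections import Counter

def norm(path: str) -> str:
    # Normalize for comparison; Windows paths are case-insensitive.
    return os.path.normcase(os.path.abspath(path))

def check_jobs(jobs: list[tuple[list[str], str]]) -> list[str]:
    """jobs: one (input_paths, output_path) tuple per planned instance."""
    outputs = Counter(norm(out) for _, out in jobs)
    inputs = {norm(p) for ins, _ in jobs for p in ins}
    problems = []
    for _, out in jobs:
        if outputs[norm(out)] > 1:
            problems.append(f"output written by more than one job: {out}")
        if norm(out) in inputs:
            problems.append(f"output overlaps an input file: {out}")
    # Several jobs reading the same input is deliberately allowed.
    return problems  # empty list means the jobs are safe to run in parallel
```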

mkvmerge is known not to be the most efficient with respect to memory transfers. What I mean by this is that it copies data internally more often than might strictly be necessary. Unfortunately this is not easy to fix at all, and all the low-hanging fruit has already been picked. This is why mkvmerge is often slower than comparable operations with e.g. ffmpeg.

The usual bottleneck is disk I/O speed, especially when reading from or writing to spinning disks instead of SSDs/NVMe drives. There’s not really a lot you can do about this apart from using SSDs/NVMes. If you upgrade to SSDs/NVMes, then memory bandwidth (and by extension CPU usage) often becomes the bottleneck, though this also depends on the type of SSD/NVMe.

In general the question you should ask yourself is: how much time could you really save under ideal conditions? Let’s assume remuxing a single file takes you 20 minutes, and you have to do about a hundred of them. Let’s further assume that you could save ¾ of that time (down to 5 minutes per file) if the files were on an ideal setup. That would mean you could save 15 × 100 = 1,500 minutes, or 25 hours. Now contrast that with the amount of both time & money you’d have to spend to get to that ideal system:

  • cost of upgrading all your storage & maybe memory/CPU
  • time needed for setting up that system
  • time needed for copying the files over to that new system
  • doing test runs comparing before/after speeds to be able to gauge how much you actually saved

Then ask yourself — is it really worth it?

Just run two or three instances in parallel, which should max out your HDD (maybe even one suffices, but I guess you can get some gains from going from one to two at least), & accept that it just takes time. Then spend the time & money you’ve saved by not chasing that mythical ideal system for MKVToolNix on stuff you actually enjoy.

Don’t get me wrong: I’m all in favor of taking easy gains; unfortunately I don’t have any on offer.

Thanks for the quick response!
Actually I certainly agree: optimization always needs to be seen in context. If there are no quick wins … I will not spend more time on this topic. The next PC system is going to come sometime anyway, I guess :wink:.
In fact … with my current setup, even going from one to two instances cuts overall speed down to roughly a third, leaving CPU usage close to zero. From what I understand of your technical explanation, it must have something to do with my hard-disk setup (which I will not change for the time being), so the batches will just take a little more time.
Have a good weekend

You’re welcome.

The problem with spinning disks is that simultaneous access to different parts of the disk totally trashes throughput due to the latency involved in positioning the read head & having to wait half a revolution (on average) before anything can be read. Things are much better if a single program reads something big from a single location, as large chunks of data are usually stored contiguously, requiring far fewer head seeks/empty rotations.

That’s the huge win with SSDs/NVMes — not only is the sustained transfer rate higher, the latency is multiple orders of magnitude lower.

No matter what you optimize your new PC for, I highly, highly suggest you invest in SSDs/NVMes.

For sure I will do this - currently, smaller SSDs serve as accelerators for tasks that handle smaller files. My multi-terabyte video storage still resides on spinning disks, though. NVMes are getting more affordable these days, so there is hope.

Thanks again

That’s good to hear, and I totally get the “price/TB” argument for mass storage. I’m not sure I understand what you meant when you said you already use SSDs as accelerators. If you don’t already, I highly suggest installing your OS & productivity apps on your SSDs, too. This provides a very noticeable boost in the responsiveness of your whole system. Just some food for thought.

Just a quick update (perhaps useful to others facing performance issues, too):
In the end it turns out that the whole topic had nothing to do with mkvmerge whatsoever.
It was a simple (but nasty) hardware issue. While the hard-disk setup looked completely OK according to CrystalDiskInfo, and somewhat reasonable according to CrystalDiskMark, there was a connection problem with one SATA cable. In fact, the final hint came from a (non-critical) number of CRC errors.
Replacing the cable solved the performance issues and yields much improved parallel execution performance.

So - life is good :slight_smile: