Modifying an audio track then adding it back - any specifics I need to know?

I’m starting a project to take an MKV file, rip the audio out, modify the audio, and then add it back in. I’m using a 5.1 (6 channel) WAV file for the audio. The timing will not be any different, the bit rate and depth will not be any different, the full length will not be any different. I’m replacing 30 seconds of audio with a different country’s release where a song was replaced due to the cost of licensing for release in the USA.

Is there anything I need to make sure is or isn’t changed? Should this just be a drop-in replacement? Are there any guides for something like this?


What you’re attempting should be easy enough to do. The only thing you may have to pay attention to is how much the audio track is offset with respect to the video track, meaning the difference between the timestamps of the first video frame and the first audio frame. When you extract the audio track from Matroska, this offset is lost, so you’ll have to re-add it when you merge in your newly encoded track (the --sync CLI option, or its corresponding GUI option “Delay”).

You can determine the offset either by looking at the verbose output of mkvinfo (e.g. mkvinfo -s original_file.mkv), or by using mkvmerge’s JSON identification mode (e.g. mkvmerge -J original_file.mkv). The latter is the recommended route, as its output (JSON) can easily be processed by other scripts (Python, Ruby, PowerShell, jq, whatever). The element you’re looking for is the minimum_timestamp property of the audio track you’re replacing. Note the units, though: minimum_timestamp is in nanoseconds, whereas the --sync option takes a number in milliseconds, so divide by 1,000,000.
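
If you want to script that lookup, here is a rough Python sketch of the idea, not a finished tool. It assumes mkvmerge is on your PATH and that the JSON output has a top-level "tracks" list whose entries carry "id", "type" and a "properties" object containing minimum_timestamp. The function names (identify, audio_delay_ms, sync_argument) are made up for this example, and treating a missing minimum_timestamp as zero is my own assumption.

```python
import json
import subprocess

NS_PER_MS = 1_000_000  # minimum_timestamp is in ns, --sync wants ms


def identify(path):
    """Run `mkvmerge -J` on a file and return the parsed JSON as a dict."""
    result = subprocess.run(
        ["mkvmerge", "-J", path], capture_output=True, text=True
    )
    # mkvmerge exits with 2 on errors (1 only signals warnings).
    if result.returncode >= 2:
        raise RuntimeError(f"mkvmerge failed: {result.stdout}{result.stderr}")
    return json.loads(result.stdout)


def audio_delay_ms(identification, track_id):
    """Return the audio track's minimum_timestamp converted to milliseconds."""
    for track in identification.get("tracks", []):
        if track.get("id") == track_id and track.get("type") == "audio":
            # Assumption: a missing property means the track starts at 0.
            ns = track.get("properties", {}).get("minimum_timestamp", 0)
            return round(ns / NS_PER_MS)
    raise ValueError(f"no audio track with ID {track_id}")


def sync_argument(delay_ms, new_track_id=0):
    """Build the --sync argument for the replacement audio file.

    The track ID refers to the track inside the new file; for a
    standalone WAV that is assumed to be 0.
    """
    return ["--sync", f"{new_track_id}:{delay_ms}"]
```

You would call it along the lines of delay = audio_delay_ms(identify("original_file.mkv"), 1), with 1 replaced by whatever track ID your audio track actually has, and then put the result of sync_argument(delay) in front of the new WAV file on the mkvmerge command line.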