I am just getting started ripping Blu-ray discs to back up my collection. I am trying to use mkvmerge to write the chapter data from the playlist into the MKV. When I do, the output file is about half the size of the input file.
To rule out the chapter step, I decided to simply “copy” the input like so:
mkvmerge -o copy.mkv original.mkv
The effect appears here too, so for now I’m going to set the chapter question aside and focus on the file size difference.
For reference, the input file is 1.5 GiB, the output file is 697.8 MiB.
I haven’t found any way to track down where the missing bytes have gone.
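The closest thing to a plan I have is to tally the packet sizes that ffprobe reports for each stream and compare the totals with the size on disk. This is only a sketch: it assumes ffprobe is on the PATH, the function names `sum_by_stream` and `stream_bytes` are made up for illustration, and I have not verified it against these files.

```shell
#!/bin/sh
# Sketch: total the packet payload bytes per stream, as reported by ffprobe.
# Function names here are hypothetical helpers, not part of any tool.

# Reads "stream_index,size" lines on stdin and prints one total per stream.
# Lines whose second field is not a plain number are ignored.
sum_by_stream() {
  awk -F, '$2 ~ /^[0-9]+$/ { bytes[$1] += $2 }
           END { for (s in bytes) printf "stream %s: %.0f bytes\n", s, bytes[s] }' | sort
}

# Prints the per-stream payload totals for one file ($1).
stream_bytes() {
  ffprobe -v error -show_entries packet=stream_index,size -of csv=p=0 "$1" | sum_by_stream
}

# Usage, with the file names from the mkvmerge command above:
#   stream_bytes original.mkv
#   stream_bytes copy.mkv
```

If the per-stream totals match between the two files, the extra bytes in the original would be sitting outside the packets ffprobe can read. If one stream's total is much larger in the original, that would at least tell me which track to look at.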
These are the routes I have pursued.
- The input file has four tracks. Each has one tag. “mkvmerge -i” reports no tags on the output. Could there be 800 MiB in those four tags? I doubt it, but I haven’t been able to confirm it. ffprobe only shows a “Duration” tag on the original, but that is present on the copy as well, so I still don’t know what the tags mkvmerge reports actually are.
- I found the other thread here about decreased file sizes, but disabling compression yields an output file of 699.5 MiB, so compression is not likely to be the issue.
- If I repeat the process on the copy, the output appears to be identical. So it’s not throwing away half the data each time; there’s something about the input.
- I checked the stream statistics in VLC player and confirmed that the copy’s total bitrate is about half of the original’s. Even so, I can’t see a difference during playback.
- I am also planning to re-encode the video to a more efficient codec for storage. When I do this on the original, I get a bunch of “Error submitting packet to decoder: Invalid data found when processing input” and “Not a valid DCA frame” errors, but the output works perfectly. When I use my encode command on the copy, the output appears to be the same, but I don’t get those errors. Is it possible the original has 800 MiB of junk data? How could I confirm that?
- In theory, if the stream is unchanged, the decoded frames should be byte-for-byte identical, right? Is there a way I could check this?
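For the last point, the check I have in mind is ffmpeg’s framemd5 muxer, which writes one MD5 line per decoded frame. The following is a sketch under assumptions: ffmpeg is on the PATH, the helper names `frame_hashes`, `hash_column` and `same_frames` are my own, and only the first video stream is hashed. I compare only the hash column, in order, so that any timestamp shifts from remuxing don’t count as differences.

```shell
#!/bin/sh
# Sketch: compare decoded video frames of two files via per-frame MD5 hashes.
# Function names here are hypothetical helpers, not part of ffmpeg.

# Decode the first video stream of $1 and write one hash line per frame to $2.
frame_hashes() {
  ffmpeg -v error -nostdin -y -i "$1" -map 0:v:0 -f framemd5 "$2"
}

# Print only the hash (last comma-separated field) of each non-comment line.
hash_column() {
  awk -F, '!/^#/ { gsub(/ /, "", $NF); print $NF }' "$1"
}

# Succeeds when both hash lists contain the same hashes in the same order.
same_frames() {
  a="$(hash_column "$1")"
  b="$(hash_column "$2")"
  [ "$a" = "$b" ]
}

# Usage, with the file names from the mkvmerge command above:
#   frame_hashes original.mkv original.framemd5
#   frame_hashes copy.mkv copy.framemd5
#   same_frames original.framemd5 copy.framemd5 && echo "decoded video matches"
```

Since the “Not a valid DCA frame” errors concern audio, the same idea with `-map 0:a:0` in place of `-map 0:v:0` might be the more telling comparison. A mismatch would not explain the cause by itself, because decoder errors on the original could also change what gets hashed.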
I’m kind of out of my depth here and don’t know what to do next. My intuition says everything is fine. There’s no way that mkvmerge is re-encoding anything, and visually it looks identical. But I am deeply curious, and there is a small nagging doubt that something is amiss.
Can anyone point me in the right direction?