Quality issues with liby4m #13

Closed
opened 2023-06-17 01:57:44 -07:00 by NRK · 3 comments

I was looking at the y4m loader and saw it doing magic check with strncmp without checking for the file size. I don't think the filesize is guaranteed to be >10bytes, so that looked like a possible buffer overflow.

Attached a patch checking for the file size and using memcmp for magic check.

However the patch is entirely untested since my distro doesn't have liby4m packaged. So I went upstream and just glancing at the code, I see several quality issues:

  1. It unconditionally prints to stderr. This is quite bad, for example nsxiv offers a --quiet flag to disable warnings, but liby4m will continue printing anyways. (And even in general, libraries shouldn't be doing any printing anyways. It might be a GUI application with no meaningful stderr).

  2. Unchecked mallocs: Ideally libraries should report allocation failure back, instead of segfaulting on null-deref.

  3. Possible buffer overflow due to fscanf "%s". (mallocing 10 bytes is also idiotic).

And overall, basing the entire API on top of stdio is also fairly shabby.
(And I also noticed now that in the y4m loaders, the file should be opened with "rb" to avoid any weird issues on windows and other platform).

Is there no better library for loading y4m that doesn't suffer from these issues?

I was looking at the y4m loader and saw it doing magic check with `strncmp` without checking for the file size. I don't think the filesize is guaranteed to be >10bytes, so that looked like a possible buffer overflow. Attached a patch checking for the file size and using memcmp for magic check. However the patch is entirely untested since my distro doesn't have liby4m packaged. So I went upstream and just glancing at the code, I see several quality issues: 1. [It unconditionally prints to stderr](https://github.com/matthewharvey/liby4m/blob/edb15f0d36293f848b35cca84a0aedabff8c33e4/src/parse.c#L57). This is quite bad, for example `nsxiv` offers a `--quiet` flag to disable warnings, but liby4m will continue printing anyways. (And even in general, libraries shouldn't be doing any printing anyways. It might be a GUI application with no meaningful stderr). 2. [Unchecked mallocs](https://github.com/matthewharvey/liby4m/blob/edb15f0d36293f848b35cca84a0aedabff8c33e4/src/parse.c#L83): Ideally libraries should report allocation failure back, instead of segfaulting on null-deref. 3. [Possible buffer overflow due to `fscanf "%s"`](https://github.com/matthewharvey/liby4m/blob/edb15f0d36293f848b35cca84a0aedabff8c33e4/src/parse.c#L188). (mallocing 10 bytes is also idiotic). And overall, basing the entire API on top of stdio is also fairly shabby. (And I also noticed now that in the y4m loaders, the file should be opened with "rb" to avoid any weird issues on windows and other platform). Is there no better library for loading y4m that doesn't suffer from these issues?
Author

The y4m format doesn't seem too complicated. I might give it a go since I'm fairly free the next ~2weeks or so :)

[The y4m format](https://wiki.multimedia.cx/index.php/YUV4MPEG2) doesn't seem *too* complicated. I might give it a go since I'm fairly free the next ~2weeks or so :)
Owner

Patch pushed.

I'm not aware of alternatives to the liby4m you mention.

I haven't looked much into the code but I agree with your concerns.
I'm also not entirely happy about that it doesn't load from memory like almost all other loaders.
And it would be nice to not rely on a library that AFAIK is not generally available in distros.
So feel free to remedy that :)

Patch pushed. I'm not aware of alternatives to the liby4m you mention. I haven't looked much into the code but I agree with your concerns. I'm also not entirely happy about that it doesn't load from memory like almost all other loaders. And it would be nice to not rely on a library that AFAIK is not generally available in distros. So feel free to remedy that :)
Author

And it would be nice to not rely on a library that AFAIK is not generally available in distros.

Going by repology, it's not packaged in a single distro. libyuv seems decently pacakged.

However, my distro still lacks libyuv as well. So I grabbed it from a 3rd party repo. Unfortunately that 3rd party build seems to be broken because dlopen() kept failing due to undefined symbols.

I attached a patch adding some debug logs + an allocation check. (Gitea doesn't allow attaching .patch files, so I had to rename them to .txt ... annoying).

So feel free to remedy that :)

I have a somewhat working y4m parser, it's able to load all the test images in this repo. It's still using libyuv (I grabbed an updated build from a different 3rd party repo), but since we only need the *ToARGB() functions, I might look into rolling those as well :)

For now, I'd like to polish up the parser - improve edge case handling, look at what other parsers do in weird cases, add fuzz testing etc. I think adding support for multi-frame y4m shouldn't be too difficult either, so I'll probably do that as well.

I'll send you the patch once everything is polished up :)

> And it would be nice to not rely on a library that AFAIK is not generally available in distros. Going by repology, it's not packaged in a single distro. libyuv [seems decently pacakged](https://repology.org/project/libyuv/versions). However, my distro still lacks libyuv as well. So I grabbed it from a 3rd party repo. Unfortunately that 3rd party build seems to be broken because `dlopen()` kept failing due to undefined symbols. I attached a patch adding some debug logs + an allocation check. (Gitea doesn't allow attaching `.patch` files, so I had to rename them to `.txt` ... annoying). > So feel free to remedy that :) I have a somewhat working y4m parser, it's able to load all the test images in this repo. It's still using libyuv (I grabbed an updated build from a different 3rd party repo), but since we only need the `*ToARGB()` functions, I might look into rolling those as well :) For now, I'd like to polish up the parser - improve edge case handling, look at what other parsers do in weird cases, add fuzz testing etc. I think adding support for multi-frame y4m shouldn't be too difficult either, so I'll probably do that as well. I'll send you the patch once everything is polished up :)
kw closed this issue 2023-06-21 10:06:59 -07:00
Sign in to join this conversation.
No Label
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: old/legacy-imlib2#13
No description provided.