imlib2 doesn't handle EINTR #5

Closed
opened 2023-01-30 23:53:00 -08:00 by guijan · 12 comments
Contributor

EINTR is a recoverable error, but imlib2 hard errors whenever it's encountered.

Here's one instance with write() which is also present in other places, use grep:
5c226954c1/src/modules/loaders/loader_bz2.c (L39-L40)

Here's one instance with fopen(), also present in other places:
5c226954c1/src/lib/image.c (L107-L109)

Here's a more complicated case, opendir() isn't defined to return EINTR by POSIX, but on OpenBSD it can:
5c226954c1/src/lib/file.c (L211-L216)
http://man.openbsd.org/opendir.3

EINTR is a recoverable error, but imlib2 hard errors whenever it's encountered. Here's one instance with write() which is also present in other places, use grep: https://git.enlightenment.org/old/legacy-imlib2/src/commit/5c226954c114f1ced7027498fd2debd34bea08f2/src/modules/loaders/loader_bz2.c#L39-L40 Here's one instance with fopen(), also present in other places: https://git.enlightenment.org/old/legacy-imlib2/src/commit/5c226954c114f1ced7027498fd2debd34bea08f2/src/lib/image.c#L107-L109 Here's a more complicated case, opendir() isn't defined to return EINTR by POSIX, but on OpenBSD it can: https://git.enlightenment.org/old/legacy-imlib2/src/commit/5c226954c114f1ced7027498fd2debd34bea08f2/src/lib/file.c#L211-L216 http://man.openbsd.org/opendir.3

I think the dir needs to be closed here, too: 5c226954c1/src/lib/file.c (L226-L229)

I think the dir needs to be closed here, too: https://git.enlightenment.org/old/legacy-imlib2/src/commit/5c226954c114f1ced7027498fd2debd34bea08f2/src/lib/file.c#L226-L229
Owner

@guijan
In theory it looks like you are right. However, I have not been able to code something which demonstrates the problem (provoke EINTR errors) on my linux box.
If you can provide such an example I'll take a look at it. Needs to be on linux, I have no xyzBSD.
This issue has to my knowledge not come up before in the lifetime of imlib2 (20+ years), so I'd like an actual test case before starting to fix it.
And how about EAGAIN - shouldn't there be similar problems with that?

@guijan In theory it looks like you are right. However, I have not been able to code something which demonstrates the problem (provoke EINTR errors) on my linux box. If you can provide such an example I'll take a look at it. Needs to be on linux, I have no xyzBSD. This issue has to my knowledge not come up before in the lifetime of imlib2 (20+ years), so I'd like an actual test case before starting to fix it. And how about EAGAIN - shouldn't there be similar problems with that?
Owner

@NRK Should be fixed now.

@NRK Should be fixed now.
Author
Contributor

EINTR on Linux and OpenBSD is only returned if the underlying medium is considered blocking, blocking means syscalls that use the fd could theoretically pause forever. The disk isn't considered blocking, pipes and FIFOs are, mounted network storage should be (I haven't checked). Even if you receive an interrupt while a syscall is operating on a fd backed by a medium that isn't considered blocking, it won't return. I don't know about other unices; the other BSDs are probably similar, and I don't know how Illumos, HaikuOS, Cygwin, and the proprietary unices behave.

Here's a program that triggers EINTR for write():
eintr.c:

#define _GNU_SOURCE

#include <sys/time.h>

#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static unsigned char buf[32768];

static void do_nothing(int);
static int childpipe(void);

/* eintr: Try to trigger EINTR */
int
main(void)
{
	struct itimerval it;
	struct sigaction act;
	int fd;

	if ((fd = childpipe()) == -1)
		err(1, "childpipe");
	memset(buf, 0xFF, sizeof(buf));

	/* Catch the SIGALRM and do nothing with it so syscalls are interrupted.
	 */
	act.sa_handler = do_nothing;
	sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	sigaction(SIGALRM, &act, NULL);

	/* Rapid fire SIGALRM. */
	it.it_interval.tv_usec = 1;
	it.it_interval.tv_sec = 0;
	it.it_value = it.it_interval;
	setitimer(ITIMER_REAL, &it, NULL);

	/* Write nonsense to the pipe forever and watch for interrupts. */
	for (;;) {
		ssize_t r;
		r = write(fd, buf, sizeof(buf));
		if (r == -1) {
			warn("write() returned -1, errno == %d", errno);
		} else if (r != sizeof(buf)) {
			warnx("write() returned %zu (short write)", (size_t)r);
		}
	}
}

static void
do_nothing(int sig)
{
}

/* childpipe: return a pipe that writes to a child.
 *
 * The child reads and does nothing forever.
 */
static int
childpipe(void)
{
	int fds[2];

	if (pipe(fds) == -1)
		return -1;

	switch(fork()) {
	case -1:
		goto err;
	case 0:
		close(fds[1]); /* Close the writer, we're the child. */
		for (;;)
			read(fds[0], buf, sizeof(buf));
		_exit(0); /* Can't happen. Child always SIGPIPEs. */
	default:
		close(fds[0]); /* Close the reader, we're the parent. */
		break;
	}

	return fds[1];
err:
	close(fds[0]);
	close(fds[1]);
	return -1;
}

Here's an example invocation of that program:

$ cc eintr.c
$ ./a.out
a.out: write() returned -1, errno == 4: Interrupted system call
a.out: write() returned -1, errno == 4: Interrupted system call
a.out: write() returned -1, errno == 4: Interrupted system call

Here's a program that triggers EINTR inside an imlib function:
imlib2-eintr.c:

#define _GNU_SOURCE

#include <sys/stat.h>
#include <sys/time.h>

#include <err.h>
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <Imlib2.h>

static char fifoname[L_tmpnam];

static void delfifo(void);
static void do_nothing(int);

/* imlib2-eintr: make imlib2 fail with EINTR. */
int
main(void)
{
	enum {
		width = 256,
		height = 256,
	};
	uint32_t *pixels;
	Imlib_Image image;
	size_t i;
	struct sigaction act;
	struct itimerval it;
	Imlib_Load_Error error_return;

	/*
	 * Create the image.
	 */
	if ((pixels = calloc(width*height, sizeof(*pixels))) == NULL)
		err(1, "calloc");
	for (i = 0; i < width*height; i++)
		pixels[i] = 0x00FFFFFF; /* Pure black, no transparency. */
	image = imlib_create_image_using_data(width, height, pixels);
	if (image == NULL)
		errx(1, "imlib_create_image_using_data failed");
	imlib_context_set_image(image);
	imlib_image_set_format("png");

	/*
	 * Create a FIFO.
	 */
	if (tmpnam(fifoname) == NULL)
		err(1, "tmpnam");
	atexit(delfifo);
	if (mkfifo(fifoname, 0777) == -1)
		err(1, "mkfifo");

	/*
	 * Set up a periodic SIGALRM so we're interrupted.
	 */
	act.sa_handler = do_nothing;
	sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	sigaction(SIGALRM, &act, NULL);
	it.it_interval.tv_usec = 500000; /* 500ms. */
	it.it_interval.tv_sec = 0;
	it.it_value = it.it_interval;
	setitimer(ITIMER_REAL, &it, NULL);

	/*
	 * Try saving the image to the FIFO's path. Opening a FIFO blocks until
	 * the FIFO is opened with read and write permissions. Nobody will open
	 * it with read permission, so we block indefinitely, and eventually,
	 * our timer sends us a SIGALRM and open() returns EINTR.
	 */
	errno = 0;
	imlib_save_image_with_error_return(fifoname, &error_return);
	if (errno || error_return)
		err(1, "imlib_save_image_with_error_return: %d", error_return);

	imlib_free_image_and_decache();
	free(pixels);
	return 0;
}

static void
delfifo(void)
{
	unlink(fifoname);
}

static void
do_nothing(int sig)
{
}

Here's an example invocation of that program:

$ cc $(pkg-config --cflags --libs imlib2) imlib2-eintr.c
$ ./a.out
a.out: imlib_save_image_with_error_return: 14: Interrupted system call

EAGAIN is a thing, yes. I've heard some hearsay that the Linux sockets code does return EAGAIN. I haven't seen any code handle EAGAIN out there or run into it myself, so I'm not sure how to proceed.

EINTR on Linux and OpenBSD is only returned if the underlying medium is considered blocking, blocking means syscalls that use the fd could theoretically pause forever. The disk isn't considered blocking, pipes and FIFOs are, mounted network storage should be (I haven't checked). Even if you receive an interrupt while a syscall is operating on a fd backed by a medium that isn't considered blocking, it won't return. I don't know about other unices; the other BSDs are probably similar, and I don't know how Illumos, HaikuOS, Cygwin, and the proprietary unices behave. Here's a program that triggers EINTR for write(): __eintr.c:__ ```c #define _GNU_SOURCE #include <sys/time.h> #include <err.h> #include <errno.h> #include <fcntl.h> #include <signal.h> #include <stddef.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> static unsigned char buf[32768]; static void do_nothing(int); static int childpipe(void); /* eintr: Try to trigger EINTR */ int main(void) { struct itimerval it; struct sigaction act; int fd; if ((fd = childpipe()) == -1) err(1, "childpipe"); memset(buf, 0xFF, sizeof(buf)); /* Catch the SIGALRM and do nothing with it so syscalls are interrupted. */ act.sa_handler = do_nothing; sigemptyset(&act.sa_mask); act.sa_flags = 0; sigaction(SIGALRM, &act, NULL); /* Rapid fire SIGALRM. */ it.it_interval.tv_usec = 1; it.it_interval.tv_sec = 0; it.it_value = it.it_interval; setitimer(ITIMER_REAL, &it, NULL); /* Write nonsense to the pipe forever and watch for interrupts. */ for (;;) { ssize_t r; r = write(fd, buf, sizeof(buf)); if (r == -1) { warn("write() returned -1, errno == %d", errno); } else if (r != sizeof(buf)) { warnx("write() returned %zu (short write)", (size_t)r); } } } static void do_nothing(int sig) { } /* childpipe: return a pipe that writes to a child. * * The child reads and does nothing forever. */ static int childpipe(void) { int fds[2]; if (pipe(fds) == -1) return -1; switch(fork()) { case -1: goto err; case 0: close(fds[1]); /* Close the writer, we're the child. */ for (;;) read(fds[0], buf, sizeof(buf)); _exit(0); /* Can't happen. Child always SIGPIPEs. */ default: close(fds[0]); /* Close the reader, we're the parent. */ break; } return fds[1]; err: close(fds[0]); close(fds[1]); return -1; } ``` Here's an example invocation of that program: ```console $ cc eintr.c $ ./a.out a.out: write() returned -1, errno == 4: Interrupted system call a.out: write() returned -1, errno == 4: Interrupted system call a.out: write() returned -1, errno == 4: Interrupted system call ``` Here's a program that triggers EINTR inside an imlib function: __imlib2-eintr.c__: ```c #define _GNU_SOURCE #include <sys/stat.h> #include <sys/time.h> #include <err.h> #include <errno.h> #include <signal.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <Imlib2.h> static char fifoname[L_tmpnam]; static void delfifo(void); static void do_nothing(int); /* imlib2-eintr: make imlib2 fail with EINTR. */ int main(void) { enum { width = 256, height = 256, }; uint32_t *pixels; Imlib_Image image; size_t i; struct sigaction act; struct itimerval it; Imlib_Load_Error error_return; /* * Create the image. */ if ((pixels = calloc(width*height, sizeof(*pixels))) == NULL) err(1, "calloc"); for (i = 0; i < width*height; i++) pixels[i] = 0x00FFFFFF; /* Pure black, no transparency. */ image = imlib_create_image_using_data(width, height, pixels); if (image == NULL) errx(1, "imlib_create_image_using_data failed"); imlib_context_set_image(image); imlib_image_set_format("png"); /* * Create a FIFO. */ if (tmpnam(fifoname) == NULL) err(1, "tmpnam"); atexit(delfifo); if (mkfifo(fifoname, 0777) == -1) err(1, "mkfifo"); /* * Set up a periodic SIGALRM so we're interrupted. */ act.sa_handler = do_nothing; sigemptyset(&act.sa_mask); act.sa_flags = 0; sigaction(SIGALRM, &act, NULL); it.it_interval.tv_usec = 500000; /* 500ms. */ it.it_interval.tv_sec = 0; it.it_value = it.it_interval; setitimer(ITIMER_REAL, &it, NULL); /* * Try saving the image to the FIFO's path. Opening a FIFO blocks until * the FIFO is opened with read and write permissions. Nobody will open * it with read permission, so we block indefinitely, and eventually, * our timer sends us a SIGALRM and open() returns EINTR. */ errno = 0; imlib_save_image_with_error_return(fifoname, &error_return); if (errno || error_return) err(1, "imlib_save_image_with_error_return: %d", error_return); imlib_free_image_and_decache(); free(pixels); return 0; } static void delfifo(void) { unlink(fifoname); } static void do_nothing(int sig) { } ``` Here's an example invocation of that program: ```console $ cc $(pkg-config --cflags --libs imlib2) imlib2-eintr.c $ ./a.out a.out: imlib_save_image_with_error_return: 14: Interrupted system call ``` EAGAIN is a thing, yes. I've heard some hearsay that the Linux sockets code does return EAGAIN. I haven't seen any code handle EAGAIN out there or run into it myself, so I'm not sure how to proceed.
Owner

Right, writing to a pipe :) It didn't cross my mind to check that path.
I tend to focus on loaders and forget about savers.
I'll fix writing to pipes in the savers.

All write() calls (e.g. the one in the bz2 loader) are to files created by mkstemp() in /tmp. I'd expect the underlying medium there to be non-blocking (RAM or local disk). So that should fail only on very peculiar setups, I think.

As things are, imlib2 will only load images from mmapable files, so if (in image.c) we should succeed in fopen()ing a pipe, mmap() will fail shortly after. Maybe it should be prevented to go there entirely by checking if we are dealing with a pipe (like it already is checked that it isn't a directory).

opendir() etc. is only called when loading modules (or fonts).
If there are pipes in that path then IMO the system is broken.
So maybe there could be a potential issue if network file systems are involved?
I don't know. I also think that is a highly unlikely scenario in the real world today.

So, for cases other than saving to a pipe I'll again await an actual test case.

Thanks for your input so far :)

Right, writing to a pipe :) It didn't cross my mind to check that path. I tend to focus on loaders and forget about savers. I'll fix writing to pipes in the savers. All write() calls (e.g. the one in the bz2 loader) are to files created by mkstemp() in /tmp. I'd expect the underlying medium there to be non-blocking (RAM or local disk). So that should fail only on very peculiar setups, I think. As things are, imlib2 will only load images from mmapable files, so if (in image.c) we should succeed in fopen()ing a pipe, mmap() will fail shortly after. Maybe it should be prevented to go there entirely by checking if we are dealing with a pipe (like it already is checked that it isn't a directory). opendir() etc. is only called when loading modules (or fonts). If there are pipes in that path then IMO the system is broken. So maybe there could be a potential issue if network file systems are involved? I don't know. I also think that is a highly unlikely scenario in the real world today. So, for cases other than saving to a pipe I'll again await an actual test case. Thanks for your input so far :)
Author
Contributor

https://linux.die.net/man/5/nfs (ctrl+f noeintr)
That manual says making NFS return EINTR was an optional and nondefault option in Linux kernels older than version 2.6.25, and that 2.6.25 and later versions never return EINTR.

https://manpages.ubuntu.com/manpages/bionic/man4/fuse.4.html (ctrl+f FUSE_INTERRUPT)
The FUSE API seems to leave continuing or returning EINTR or a short read/write up to the individual filesystem.

I don't think poking into the current implementation details of one kernel and writing programs according to those details and a particular set of use cases is the right way to do things. Handling EINTR where POSIX or OS-specific documentation mentions it is the right approach. I don't have a test case.

https://linux.die.net/man/5/nfs (ctrl+f noeintr) That manual says making NFS return EINTR was an optional and nondefault option in Linux kernels older than version 2.6.25, and that 2.6.25 and later versions never return EINTR. https://manpages.ubuntu.com/manpages/bionic/man4/fuse.4.html (ctrl+f FUSE_INTERRUPT) The FUSE API seems to leave continuing or returning EINTR or a short read/write up to the individual filesystem. I don't think poking into the current implementation details of one kernel and writing programs according to those details and a particular set of use cases is the right way to do things. Handling EINTR where POSIX or OS-specific documentation mentions it is the right approach. I don't have a test case.

I don't think poking into the current implementation details of one kernel and writing programs according to those details and a particular set of use cases is the right way to do things. Handling EINTR where POSIX or OS-specific documentation mentions it is the right approach.

While I agree in general - other than the write I don't think it's worth dealing with EINTR on the other cases - it's just busywork.

And EINTR in specific is an insane headace - most kernels seem to have done the "right thing" by making EINTR as rare as possible. Here's one of my faviourite - the fact that every single close call is "incorrect" under current POSIX. From the linux manpage:

The EINTR error is a somewhat special case.  Regarding the EINTR error, POSIX.1-2008 says:

     If close() is interrupted by a signal that is to be caught, it shall return -1 with errno set  to  EINTR and the state of fildes is unspecified.

This permits the behavior that occurs on Linux and many other implementations, where, as with other errors that may be reported by close(), the file descriptor is guaranteed to be closed.  However, it also  permits  another possibility:  that the implementation returns an EINTR error and keeps the file descriptor open.  (According to its documentation, HP-UX's close() does this.)  The caller must then once more use close() to  close  the  file descriptor,  to  avoid file descriptor leaks.  This divergence in implementation behaviors provides a difficult hurdle for portable applications, since on many implementations, close() must not  be  called  again  after  an EINTR  error, and on at least one, close() must be called again.
> I don't think poking into the current implementation details of one kernel and writing programs according to those details and a particular set of use cases is the right way to do things. Handling EINTR where POSIX or OS-specific documentation mentions it is the right approach. While I agree in general - other than the `write` I don't think it's worth dealing with `EINTR` on the other cases - it's just busywork. And EINTR in specific is an insane headace - most kernels seem to have done the "right thing" by making EINTR as rare as possible. Here's one of my faviourite - the fact that every single `close` call is "incorrect" under current POSIX. From the [linux manpage](https://www.man7.org/linux/man-pages/man2/close.2.html#NOTES): ``` The EINTR error is a somewhat special case. Regarding the EINTR error, POSIX.1-2008 says: If close() is interrupted by a signal that is to be caught, it shall return -1 with errno set to EINTR and the state of fildes is unspecified. This permits the behavior that occurs on Linux and many other implementations, where, as with other errors that may be reported by close(), the file descriptor is guaranteed to be closed. However, it also permits another possibility: that the implementation returns an EINTR error and keeps the file descriptor open. (According to its documentation, HP-UX's close() does this.) The caller must then once more use close() to close the file descriptor, to avoid file descriptor leaks. This divergence in implementation behaviors provides a difficult hurdle for portable applications, since on many implementations, close() must not be called again after an EINTR error, and on at least one, close() must be called again. ```
Owner

EINTR is now handled in connection with fopen() in savers (and loaders).
With this I see no problems with EINTR.

In theory there may well be more problems, but I'm not convinced that there are in practice.
Until I am convinced that there are actual problems I'm not going to try to fix things that may not be a problem.
But I probably won't object much to others doing so :)

EINTR is now handled in connection with fopen() in savers (and loaders). With this I see no problems with EINTR. In theory there may well be more problems, but I'm not convinced that there are in practice. Until I am convinced that there are actual problems I'm not going to try to fix things that may not be a problem. But I probably won't object much to others doing so :)

Could just set SA_RESTART to avoid all the pain. Personally, I use the following function at the top of my mains: https://git.sr.ht/~q3cpma/misc-tools/tree/master/item/misc.c#L45

Could just set SA_RESTART to avoid all the pain. Personally, I use the following function at the top of my mains: https://git.sr.ht/~q3cpma/misc-tools/tree/master/item/misc.c#L45

Could just set SA_RESTART to avoid all the pain.

That's something that should be done on the application side. Altering global state behind the user's back is NOT something I like to see in libraries.
(More generally, I prefer libraries that don't have any hidden global states at all. Imlib2 is guilty of it unfortunately - but it's not a dealbreaker - just a nitpick).

> Could just set SA_RESTART to avoid all the pain. That's something that should be done on the application side. Altering global state behind the user's back is NOT something I like to see in libraries. (More generally, I prefer libraries that don't have _any_ hidden global states at all. Imlib2 is guilty of it unfortunately - but it's not a dealbreaker - just a nitpick).

Right, forgot we were talking about a library for a second.

Right, forgot we were talking about a library for a second.
Owner

I'll leave things as they are for now.

I'll leave things as they are for now.
kw closed this issue 2023-04-06 00:29:17 -07:00
Sign in to join this conversation.
No Label
No Milestone
No project
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: old/legacy-imlib2#5
No description provided.