Ian Jackson wrote:
Anthony Liguori writes ("[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle
terminating signals."):
The race I know of is that you may get an aio signal completion before
select but after you've already qemu_aio_poll()'d. In practice, we only
sleep for 10ms at a time in select() so the race is handled by that. If
we wanted to increase the amount of time we slept, we would have to
handle this race.
Yes. And, 10ms is too long anyway for reasonable performance. During
my merge with upstream I found that the qemu aio functionality (which
was done quite differently to the old xen ioemu) caused a severe
performance regression under some conditions because of this race.
Yeah, noticed that too, especially with qcow2.
In KVM, we sleep for 1s in select() and use signalfd() to receive the
aio notifications. For older hosts, we emulate signalfd using a thread
and the pipe-to-self trick.
Why does it need a thread ? You can just write to the pipe in the
signal handler. I'll post my code.
It's a little less perfect. Your signal handler can write() to the pipe
but what happens if you get EAGAIN?
So what we do in KVM is use sigwait() within a separate thread. We
don't set O_NONBLOCK so the thread blocks if the pipe fills up which is
exactly the semantics you would want.
Below is our implementation. I'll queue up to push this change into
QEMU after I finish with the migration patches.
Regards,
Anthony Liguori
#include <sys/syscall.h>
#include <pthread.h>
struct sigfd_compat_info
{
sigset_t mask;
int fd;
};
static void *sigwait_compat(void *opaque)
{
struct sigfd_compat_info *info = opaque;
int err;
sigset_t all;
sigfillset(&all);
sigprocmask(SIG_BLOCK, &all, NULL);
do {
siginfo_t siginfo;
err = sigwaitinfo(&info->mask, &siginfo);
if (err == -1 && errno == EINTR) {
err = 0;
continue;
}
if (err > 0) {
char buffer[128];
size_t offset = 0;
memcpy(buffer, &err, sizeof(err));
while (offset < sizeof(buffer)) {
ssize_t len;
len = write(info->fd, buffer + offset,
sizeof(buffer) - offset);
if (len == -1 && errno == EINTR)
continue;
if (len <= 0) {
err = -1;
break;
}
offset += len;
}
}
} while (err >= 0);
return NULL;
}
static int kvm_signalfd_compat(const sigset_t *mask)
{
pthread_attr_t attr;
pthread_t tid;
struct sigfd_compat_info *info;
int fds[2];
info = malloc(sizeof(*info));
if (info == NULL) {
errno = ENOMEM;
return -1;
}
if (pipe(fds) == -1) {
free(info);
return -1;
}
memcpy(&info->mask, mask, sizeof(*mask));
info->fd = fds[1];
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_create(&tid, &attr, sigwait_compat, info);
pthread_attr_destroy(&attr);
return fds[0];
}
int kvm_signalfd(const sigset_t *mask)
{
#if defined(SYS_signalfd)
int ret;
ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8);
if (!(ret == -1 && errno == ENOSYS))
return ret;
#endif
return kvm_signalfd_compat(mask);
}
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|