Xen project Mailing List

Re: xenstored file descriptor leak

From: Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx>

Date: Wed, 3 Feb 2021 08:57:21 +0100

Delivery-date: Wed, 03 Feb 2021 07:57:34 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Feb 03, 2021 at 07:18:51AM +0100, Jürgen Groß wrote: > On 02.02.21 19:37, Manuel Bouyer wrote: > > Hello, > > on NetBSD I'm tracking down an issue where xenstored never closes its > > file descriptor to /var/run/xenstored/socket and instead loops at 100% > > CPU on these descriptors. > > > > xenstored loops because poll(2) returns a POLLIN event for these > > descriptors but they are marked is_ignored = true. > > > > I'm seeing this with xen 4.15, 4.13 and has also been reported with 4.11 > > with latest security patches. > > It seems to have started with patches for XSA-115 (A user reported this > > for 4.11) > > > > I've tracked it down to a difference in poll(2) implementation; it seems > > that linux will return something that is not (POLLIN|POLLOUT) when a > > socket if closed; while NetBSD returns POLLIN (this matches the NetBSD's > > man page). > > Yeah, Linux seems to return POLLHUP additionally. Overall, it seems that the poll(2) behavior with non-regular files is highly OS-dependant when it comes to border cases (like a remote close of a socket) > > > > > First I think there may be a security issue here, as, even on linux it > > should > > be possible for a client to cause a socket to go to the is_ignored state, > > while still sending data and cause xenstored to go to a 100% CPU loop. > > No security issue, as sockets are restricted to dom0 and user root. > > In case you are suspecting a security issue, please don't send such > mails to xen-devel in future, but to security@xxxxxxxxxxxxxxxxxxxx. Yes, sorry. Will do next time. > > > I think this is needed anyway: > > > > --- xenstored_core.c.orig 2021-02-02 18:06:33.389316841 +0100 > > +++ xenstored_core.c 2021-02-02 19:27:43.761877371 +0100 > > @@ -397,9 +397,12 @@ > > !list_empty(&conn->out_list))) > > *ptimeout = 0; > > } else { > > - short events = POLLIN|POLLPRI; > > - if (!list_empty(&conn->out_list)) > > - events |= POLLOUT; > > + short events = 0; > > + if (!conn->is_ignored) { > > + events |= POLLIN|POLLPRI; > > + if (!list_empty(&conn->out_list)) > > + events |= POLLOUT; > > + } > > conn->pollfd_idx = set_fd(conn->fd, events); > > } > > } > > Yes, I think this is a good idea. Well, after some sleep I don't think it is. We should always keep at last POLLIN or we will never notice a socket close otherwise. > > > > > Now I wonder if, on NetBSD at last, a read error or short read shouldn't > > cause the socket to be closed, as with: > > > > @@ -1561,6 +1565,8 @@ > > bad_client: > > ignore_connection(conn); > > + /* we don't want to keep this connection alive */ > > + talloc_free(conn); > > } > > This is wrong for non-socket connections, as we want to keep the domain > in question to be known to xenstored. > > For socket connections this should be okay, though. What are "non-socket connections" BTW ? I don't think I've seen one in my test. Is there a way to know if a connection is socket or non-socket ? -- Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx> NetBSD: 26 ans d'experience feront toujours la difference --

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.