Re: [Xen-devel] [Xen-users] "xl restore" leaks a file descriptor?
On 11/08/15 18:07, Wei Liu wrote:
> On Tue, Aug 11, 2015 at 04:48:13PM +0100, Ian Campbell wrote:
>> On Tue, 2015-08-11 at 11:13 -0400, Andrew Armenia wrote:
>>> It's the checkpoint file - i.e. the command line argument to xl
>>> restore - that is being leaked.
>> Thanks.
>>
>> [...]
>>> So the checkpoint file is clearly being leaked.
>> Indeed. I confirmed this even with the current development version using ls
>> -l /proc/<pid>/fd which shows an fd open on a deleted file:
>>
>> # ps aux| grep xl
>> root 20465 0.0 0.2 106036 984 ? SLsl 15:42 0:00 xl restore
>> save
>> # ls -l /proc/20465/fd
>> [...]
>> lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save
>> [...]
>> # rm /root/save
>> # ls -l /proc/20465/fd
>> [...]
>> lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save (deleted)
>> [...]
>>
>>> Its space is not freed
>>> until the 'xl restore' process is ended by shutting down the domain:
>> [...]
>>> It seems like xl restore should close the checkpoint file as soon as
>>> it's done restoring the domain, allowing the space to be freed, but
>>> that's clearly not happening.
>> Right. In fact xl sets the file to be close-on-exec right after opening it,
>> which is before the daemonisation step, so it ought to be closed
>> automatically, but isn't for some reason.
>>
>> My working theory is that something in the machinery which spawns the save
>> helper is defeating the use of CLOEXEC, perhaps by dup2() or perhaps by
>> unsetting CLOEXEC.
>>
>> Any way, thanks for reporting. I've copied the devel list and 4.6 RM. Wei
>> this probably ought to be a blocker for 4.6 (and the fix ought ultimately
>> to be backported to 4.4 onwards at least).
>>
>> NB: This leak seems to be independent of the switch to migration v2.
>>
>> Ian.
> Maybe this is just because we leak a fd.
>
> I don't see how CLOEXEC would be of any use if xl doesn't actually exec
> anything.
>
> Below is a PoC patch which seems to fix the problem for me.
>
> ---8<---
> commit 7b5f466d5977dc9f41991ca0c2227023ac07709d
> Author: Wei Liu <wei.liu2@xxxxxxxxxx>
> Date:   Tue Aug 11 18:02:25 2015 +0100
>
>     xl: close restore_fd when we finish with it
>
>     Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
>
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 499a05c..525cd24 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -2846,6 +2846,10 @@ start:
>          ret = libxl_domain_create_new(ctx, &d_config, &domid,
>                                        0, autoconnect_console_how);
>      }
> +
> +    if (migrate_fd < 0)
> +        close(restore_fd);
> +

You surely need to check for restore_fd >= 0, to avoid a potential EBADF?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
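
For illustration only, here is a minimal standalone sketch of the two points
raised in this thread: a descriptor on a deleted file stays valid until it is
closed (CLOEXEC only matters across exec()), and a close guarded by an
fd >= 0 check avoids calling close(-1), which fails with EBADF. This is not
the xl code. close_if_open() and leak_demo() are hypothetical names, and the
scratch path under /tmp is an assumption made for the demo.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of xl: close *fd only if it holds a
 * non-negative descriptor, then mark it closed so that a second call
 * is a no-op instead of a close() on a stale or negative value. */
static int close_if_open(int *fd)
{
    int rc = 0;

    if (*fd >= 0) {
        rc = close(*fd);
        *fd = -1;               /* prevents a later double close */
    }
    return rc;
}

/* Reproduce the situation on a scratch file: the name is removed, but
 * the descriptor remains usable until it is explicitly closed.
 * Returns 0 on success, -1 on any unexpected result. */
static int leak_demo(void)
{
    char path[] = "/tmp/restore-demo-XXXXXX";  /* illustrative path */
    struct stat st;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;

    /* Mark the fd close-on-exec. This takes effect only at exec();
     * a process that keeps running still holds the fd open. */
    if (fcntl(fd, F_SETFD, FD_CLOEXEC) < 0 || unlink(path) < 0) {
        close(fd);
        return -1;
    }

    /* The name is gone ... */
    if (stat(path, &st) == 0 || errno != ENOENT) {
        close(fd);
        return -1;
    }

    /* ... but the open descriptor still refers to the file. */
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    /* Releasing the descriptor is what lets the space be freed. */
    if (close_if_open(&fd) != 0)
        return -1;
    assert(fd == -1);

    /* The guarded second call does nothing and reports success. */
    return close_if_open(&fd);
}
```

Resetting the variable to -1 after closing is a design choice in this sketch
rather than something the PoC patch does; it keeps a later cleanup path from
closing the same number twice.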