
Re: [Xen-devel] [RFC v2] xSplice design



On Fri, Jun 12, 2015 at 07:31:25PM +0200, Martin Pohlack wrote:
> On 12.06.2015 16:43, Jan Beulich wrote:
> >>>> On 12.06.15 at 16:31, <mpohlack@xxxxxxxxxx> wrote:
> >> The 1ms is just a random number.  I would actually suggest allowing a
> >> sysadmin or hotpatch management tooling to specify how long one is
> >> willing to potentially block the whole machine when waiting for a
> >> stop_machine-like barrier as part of a relevant hypercall.  You could
> >> imagine userland starting out with 1ms and slowly working its way up
> >> whenever it retries.
> > 
> > In which case the question would be why it didn't start with a larger
> > timeout from the beginning. If anything, I could see this being used
> > to allow for a larger stop window for more critical patches.
> 
> The main idea is that situations where you cannot patch immediately are
> transient (e.g., instance start / stop, ...).  So by trying a couple of
> times with a very short timeout every minute or so, the chances of
> succeeding without causing any large interruption for guests are very
> high.
> 
> Also, you usually have some time to deploy a hotpatch, given the typical
> XSA embargo period.  So by slowly increasing the maximum blocking time
> that one is willing to pay, one would patch the vast majority very
> quickly and would still have the option to patch stragglers by paying
> a bit more blocking time later in the patch period.

The system admin would want the patch applied regardless of whether the
mechanism took milliseconds or seconds. Knobs to define the timeout are
not necessary - what the admin would most likely want to be told is:
"Hypervisor is busy, attempt #31415" instead of silence and a hung
command.
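
Something like the sketch below is what I have in mind for the tooling
side. To be clear, xsplice_apply() and its -EAGAIN convention are made
up for illustration - we have not specified the hypercall interface
yet - but it combines Martin's escalating timeout with the attempt
reporting:

/* Illustrative only: xsplice_apply() is a hypothetical wrapper around
 * the patch-apply hypercall, returning 0 on success and -EAGAIN when
 * the stop_machine-like barrier times out. */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

extern int xsplice_apply(const char *patch, unsigned int timeout_ms);

int apply_with_retries(const char *patch)
{
    unsigned int timeout_ms = 1;   /* start small, per Martin */
    unsigned int attempt;

    for ( attempt = 1; ; attempt++ )
    {
        int rc = xsplice_apply(patch, timeout_ms);

        if ( rc == 0 )
            return 0;
        if ( rc != -EAGAIN )
            return rc;             /* hard failure, give up */

        /* Tell the admin what is going on instead of hanging silently. */
        fprintf(stderr, "Hypervisor is busy, attempt #%u\n", attempt);

        /* Slowly raise the window we are willing to block for. */
        if ( timeout_ms < 1000 )
            timeout_ms *= 2;
        sleep(60);                 /* retry roughly every minute */
    }
}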

And maybe a --pause argument for when the hypervisor is really tied up
and can't get any breathing room - which would pause all the guests
before patching, and unpause them afterwards.
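
Roughly like this - the libxenctrl pause/unpause calls exist today,
while xsplice_apply() is the same placeholder as above:

#include <xenctrl.h>

extern int xsplice_apply(const char *patch, unsigned int timeout_ms);

int apply_paused(const char *patch)
{
    xc_interface *xch = xc_interface_open(NULL, NULL, 0);
    xc_domaininfo_t info[256];
    int i, n, rc;

    if ( !xch )
        return -1;

    /* Start at domid 1 so dom0 (running this tool) stays unpaused. */
    n = xc_domain_getinfolist(xch, 1, 256, info);
    for ( i = 0; i < n; i++ )
        xc_domain_pause(xch, info[i].domain);

    /* With the guests quiesced the barrier should succeed quickly. */
    rc = xsplice_apply(patch, 1000);

    for ( i = 0; i < n; i++ )
        xc_domain_unpause(xch, info[i].domain);

    xc_interface_close(xch);
    return rc;
}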


However, I just realized one problem with triggering the patching
synchronously through the hypercall (as opposed to having it done
asynchronously).

We would not be able to patch any of the code that is invoked while
this hypercall is in progress - that is, do_domctl, the spinlocks,
anything put on the stack, etc.
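
To illustrate: a synchronous hypercall means the calling CPU's stack by
definition contains do_domctl and whatever it took to get there, so a
conservative stack check along these lines (all names below are invented
for illustration, not an existing interface) would always veto patching
those functions:

#include <stdbool.h>

struct patch_target {
    unsigned long start;  /* first byte of the function being replaced */
    unsigned long end;    /* one past its last byte */
};

/* Scan one CPU's stack for anything that looks like a return address
 * into the target.  A word-by-word scan is a conservative stand-in
 * for a real unwinder. */
static bool stack_references_target(const unsigned long *stack_bottom,
                                    const unsigned long *stack_top,
                                    const struct patch_target *t)
{
    const unsigned long *sp;

    for ( sp = stack_bottom; sp < stack_top; sp++ )
        if ( *sp >= t->start && *sp < t->end )
            return true;  /* someone may still return into old code */

    return false;
}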

