[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 0/3] vm_event: fix race-condition when disabling monitor events
> For the last couple years we have received numerous reports from users of > monitor vm_events of spurious guest crashes when using events. In particular, > it has observed that the problem occurs when vm_events are being disabled. The > nature of the guest crash varied widely and has only occured occasionally. > This > made debugging the issue particularly hard. We had discussions about this > issue > even here on the xen-devel mailinglist with no luck figuring it out. > > The bug has now been identified as a race-condition between register event > handling and disabling the vm_event interface. > > Patch 96760e2fba100d694300a81baddb5740e0f8c0ee, "vm_event: deny register > writes > if refused by vm_event reply" is the patch that introduced the error. In this > patch emulation of register write events can be postponed until the > corresponding vm_event handler decides whether to allow such write to take > place. Unfortunately this can only be implemented by performing the deny/allow > step when the vCPU gets scheduled. Due to that postponed emulation of the > event > if the user decides to pause the VM in the vm_event handler and then disable > events, the entire emulation step is skipped the next time the vCPU is > resumed. > Even if the user doesn't pause during the vm_event handling but exits > immediately and disables vm_event, the situation becomes racey as disabling > vm_event may succeed before the guest's vCPUs get scheduled with the pending > emulation task. This has been particularly the case with VMs that have several > vCPUs as after the VM is unpaused it may actually take a long time before all ?vCPUs get scheduled. > > The only solution currently is to poll each vCPU before vm_events are disabled > to verify they had been scheduled. The following patches resolve this issue in > a much nicer way. > > Patch 1 adds an option to the monitor_op domctl that needs to be specified if > the user wants to actually use the postponed register-write handling > mechanism. If that option is not specified then handling is performed the > same way as before patch 96760e2fba100d694300a81baddb5740e0f8c0ee. > > Patch 2 performs sanity checking when disabling vm_events to determine whether > its safe to free all vm_event structures. The vCPUs still get unpaused to > allow them to get scheduled and perform any of their pending operations, > but otherwise an -EAGAIN error is returned signaling to the user that they > need to wait and try again disabling the interface. > > Patch 3 adds a vm_event specifically to signal to the user when it is safe to > continue disabling the interface. > > Shout out to our friends at CERT.pl for stumbling upon a crucial piece of > information that lead to finally squashing this nasty bug. > > Tamas K Lengyel (3): > xen/monitor: Control register values > xen/vm_event: add vm_event_check_pending_op > xen/vm_event: Add safe to disable vm_event > > xen/arch/x86/hvm/hvm.c | 69 +++++++++++++++++++++++-------- > xen/arch/x86/hvm/monitor.c | 14 +++++++ > xen/arch/x86/monitor.c | 23 ++++++++++- > xen/arch/x86/vm_event.c | 23 +++++++++++ > xen/common/vm_event.c | 17 ++++++-- > xen/include/asm-arm/vm_event.h | 7 ++++ > xen/include/asm-x86/domain.h | 2 + > xen/include/asm-x86/hvm/monitor.h | 1 + > xen/include/asm-x86/vm_event.h | 2 + > xen/include/public/domctl.h | 3 ++ > xen/include/public/vm_event.h | 8 ++++ > 11 files changed, 147 insertions(+), 22 deletions(-) > > -- > 2.26.1 Hi, I have reproduced the mentioned race-condition between register event handling and disabling the vm_event interface. With Xen 4.13.0 and Windows 7 x64 DomU (2 VCPUs), my test program causes random BSODs on DomU once vm_event interface is disabled. I can confirm that after applying Patch 1 to Xen 4.13.0 the test program doesn't crash the DomU anymore, so it would actually resolve the mentioned bug. Best regards, Michał Leszczyński CERT Poland
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |