
Re: [PATCH v4 for-4.14] x86/monitor: revert default behavior when monitoring register write events


  • To: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
  • From: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>
  • Date: Tue, 9 Jun 2020 00:16:15 +0300
  • Cc: Petre Pircalabu <ppircalabu@xxxxxxxxxxxxxxx>, Andrei LUTAS <vlutas@xxxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Mihai Donțu <mdontu@xxxxxxxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Tue, 09 Jun 2020 04:17:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 6/8/20 11:44 PM, Tamas K Lengyel wrote:
> On Mon, Jun 8, 2020 at 2:14 PM Razvan Cojocaru
> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>
>> On 6/8/20 10:54 PM, Tamas K Lengyel wrote:
>>> On Mon, Jun 8, 2020 at 12:58 PM Razvan Cojocaru
>>>> And last but not least, the proper sequence is: 1. unsubscribe from
>>>> register write events, 2. process all events "still in the chamber"
>>>> (keep checking the ring buffer for a while), 3. detach from the guest
>>>> (disable the vm_event subsystem). Not ideal perhaps (in that it's not
>>>> guaranteed that a VCPU won't resume after a longer period than our
>>>> timeout), but if the sequence is followed there should be no guest hangs
>>>> or crashes (at least none that we or our clients have observed so far).
>>>
>>> Incorrect. That's not enough. You also have to wait for all the vCPUs
>>> to get scheduled before disabling vm_event, otherwise the emulation
>>> is skipped entirely. Please read the patch message. If the user
>>> decides to disable vm_event after getting a CR3 event delivered, the
>>> CR3 write never gets applied, and the guest crashes in
>>> unpredictable ways. The same happens with all the other registers.
>>
>> I did read the patch message. As I've previously stated ("it's not
>> guaranteed that a VCPU won't resume after a longer period than our
>> timeout"), it's not ideal, and I've made no claim that the problem does
>> not exist or that it shouldn't be fixed - but really if you've got a
>> VCPU that will wait more than a couple of seconds to get scheduled then
>> you've got a bigger problem with the VM.
> 
> Sorry, missed the part where you say you knew about this issue. We
> didn't, and it was absolutely not documented anywhere and certainly not
> mentioned in the original patch that added the feature. It literally
> took us years to figure out what the hell was going on. While, as you
> say, it can be a couple of seconds before it's safe to disable, it can
> be a lot longer depending on the system state, how many VMs are
> running, and how many vCPUs are active on the VM. There are absolutely
> necessary use-cases where you want to keep the VM paused after a CR3
> event is received and exit vm_event afterwards. This bug totally blocks
> those use-cases. Not to mention that it's a total mess having an
> interface where the user has to guess when it's safe to do something.
> If this had been pointed out when the patch was made, I would have
> objected to it being merged.

No, we didn't know about the issue. It's apparently not my most eloquent
evening. I was merely saying that I understood what the issue was from
your description, and offering an explanation of why we've never seen it
in QA or production. Had it been known, the issue would of course at
least have been mentioned (and ideally not exist to begin with).

We do take several passes through the ring buffer (which, as a
side-effect, reduces the probability of such a race occurring to almost
zero), but we do that to make sure we've cleaned up all EPT vm_events;
the fact that it has helped with _this_ issue is a coincidence.
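For illustration, the multi-pass drain described above can be sketched
roughly as follows. This is a simplified stand-in, not Xen code: the
`drain_pass_fn` callback and `drain_ring()` helper are hypothetical
names; a real monitor would walk the shared vm_event ring page via
libxc/libvmi and put responses back before detaching.

```c
#include <stddef.h>

/* Hypothetical stand-in for one pass over the vm_event ring: returns
 * the number of events consumed on this pass.  In a real monitor this
 * would pull requests off the shared ring and push back responses. */
typedef size_t (*drain_pass_fn)(void *ctx);

/* Keep draining until `quiet_passes` consecutive passes find nothing,
 * so late-scheduled vCPUs get a chance to deliver events "still in the
 * chamber" before the vm_event subsystem is torn down. */
static size_t drain_ring(drain_pass_fn pass, void *ctx,
                         unsigned quiet_passes)
{
    size_t total = 0;
    unsigned quiet = 0;

    while (quiet < quiet_passes)
    {
        size_t n = pass(ctx);
        total += n;
        quiet = n ? 0 : quiet + 1;
        /* A real implementation would sleep briefly between passes. */
    }
    return total;
}
```

Note that, as discussed above, no fixed number of quiet passes can
guarantee a vCPU won't resume later; this only makes the window small.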

IIRC Windows, at least, will be upset if a VCPU is stuck for too long.

As for how the vm_event system behaves:

1. A security application that is unable to _prevent_ things being done
to a system is not doing a very good job, since in that case it can
only collect stats, not veto anything. I would argue that the default
for such a monitoring application should be the current one (all events
should be pre-action).

2. This is further supported by the fact that that's exactly how the EPT
vm_events work: you get a "I want to write to this page" event _before_
the write occurs. If register writes behave differently, you have _two_
different models: one where you get an event post-action, and one where
you get one pre-action.


Hope that's clearer,
Razvan
