
Re: [Xen-devel] Clarification regarding MEM_ACCESS_* flags usage





On Thu, Oct 6, 2016 at 3:59 AM, Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx> wrote:
On 10/05/2016 11:54 PM, Julien Grall wrote:
>
>
> On 05/10/2016 13:23, Tamas K Lengyel wrote:
>> Hi Julien,
>> It is expected that certain combinations of mem_access flags will put
>> the domain into unstable condition, resulting in a crash or a hang. As
>> Razvan mentioned, on x86 we can end up triggering EPT misconfiguration
>> with the wrong set of flags. The user of the API is expected to know
>> what he/she is doing in this regard, we don't do any enforcements or
>> sanity checking on the Xen side.
>>
>> As to the issue you describe, indeed that can happen. If the user marks
>> a pagetable area non-readable/non-writable, then because ARM reports a
>> walk for an instruction fetch as an execute violation when it traps,
>> the VM will hang in a continuous violation state, as no execute
>> violation was requested to be triggered on the gfn by the user. There
>> are other situations where this can happen: on ARM there is no such
>> thing as execute-only memory, so any request for execute-only or
>> writable-executable memory will lead to problems like this, i.e. an
>> instruction-fetch violation when the user only requested
>> read violations. But again, the users are expected to know what they are
>> doing and perform their own sanity checks as appropriate.
>
> I think the problem I described is neither the fault of the user
> nor a misconfiguration of the page table. Let me clarify it.
>
> The user can purposefully restrict the access to stage-1 page table to
> detect when the OS is modifying them. By side effect, this will also
> impact the page table walker.
>
> A prefetch abort (i.e. an error that occurs when the processor is trying
> to load an instruction) can occur either during a stage-1 page table
> walk (e.g. the underlying memory of the stage-1 page table has been
> protected) or because the permission in the stage-2 entry has been
> restricted.
>
> In the latter case, this will always be because the memory is not
> executable. The former, however, may happen when the page table walker
> (i.e. the MMU) is reading/writing the entry.
>
> However, Xen ARM today is always considering that a prefetch abort will
> happen because it was not possible to execute the instruction.
>
> I requested clarification about the flags because we need to fix this
> valid issue. From the usage on ARM and in the vm event app, it is not
> clear how those flags should be used.

I understand. FWIW, I find it better to have the most precise type of
event sent, i.e. in your case if the application gets a read-violation
event it would then be able to do something about it (for example,
lift the restrictions on the page), whereas if it got an execute-denied
event in this case, allowing execution on that page would not solve the
issue and would leave the guest in an infinite loop, as you say. The
problem here is that the application never gets a chance to do the right
thing even if it wants to, and is capable of that.
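
To make the difference concrete, here is a toy model in C (not Xen code;
the PERM_* bits, the function name and the "application grants whatever
the event names" policy are all assumptions made up for this sketch). It
counts how many events an application has to handle before the guest's
instruction fetch can proceed, depending on whether the event names the
permission that was really missing:

```c
#include <stdbool.h>

/* Hypothetical permission bits for this sketch; not Xen's MEM_ACCESS_* values. */
enum { PERM_R = 1u << 0, PERM_W = 1u << 1, PERM_X = 1u << 2 };

/*
 * Toy model: the guest fetches an instruction whose translation needs a
 * read of a pagetable page with permissions pt_allowed.  On every trap
 * the application grants the permission named in the event.
 *   precise == true : the event names the read violation on the pagetable
 *   precise == false: the event names an execute violation on the code page
 * Returns the number of events handled before the fetch succeeds, or -1
 * if the guest is still faulting after max_events events.
 */
static int events_until_progress(unsigned pt_allowed, bool precise,
                                 int max_events)
{
    unsigned code_allowed = PERM_R | PERM_W | PERM_X; /* code page unrestricted */

    for (int handled = 0; handled <= max_events; handled++) {
        if ((pt_allowed & PERM_R) && (code_allowed & PERM_X))
            return handled;            /* walk and fetch both succeed */
        if (handled == max_events)
            break;
        if (precise)
            pt_allowed |= PERM_R;      /* lifts the restriction that matters */
        else
            code_allowed |= PERM_X;    /* already allowed, changes nothing */
    }
    return -1;
}
```

With a precise event the model makes progress after a single event; with
the execute-denied event it never does, however many events are handled.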

So I'm all for properly differentiating between these two cases, unless
the ARM SDM disagrees or there's some reason why this is unfeasible.

The issue I see here is that if the CPU itself traps as an instruction-fetch violation because the pagetable was unreadable, then sending out a vm_event with a MEM_ACCESS_* type other than what the hardware reported will complicate things significantly. When no violated mem_access X setting is found, the mem_access system in Xen would have to check further whether all pages used for translating the PC were readable. This would require us to walk the currently active pagetable, check whether any of its pages has a restricted mem_access setting, and if one is found, send out a notification with the MEM_ACCESS_R flag set. This is pretty complicated considering all the different page types the OS could use.

I'd rather not move this logic into Xen but have the user implement it if it is needed. For example, a user who wants to make the pages where pagetables reside unreadable with mem_access would also have to mark all pages mapped by that pagetable non-executable with mem_access. Since the current setup can be worked with, I'd rather not complicate the Xen side and just have it accurately report the trap as it received it from the CPU itself.
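
Roughly, the extra check described above (look for an X restriction on the faulting gfn first, then fall back to the pagetable pages used to translate the PC) would look like the sketch below. This is only an illustration: the PERM_* bits, struct restriction and both function names are made up for the example and are not Xen's mem_access interface, and the list of pagetable gfns is assumed to have been collected already, which is the hard part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits for this sketch; not Xen's MEM_ACCESS_* values. */
enum { PERM_R = 1u << 0, PERM_W = 1u << 1, PERM_X = 1u << 2 };
#define PERM_RWX (PERM_R | PERM_W | PERM_X)

/* One user-requested restriction: the permissions still allowed on a gfn. */
struct restriction {
    uint64_t gfn;
    unsigned allowed;
};

/* Permissions allowed on gfn; frames without an entry are unrestricted. */
static unsigned allowed_on(const struct restriction *tbl, size_t n,
                           uint64_t gfn)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].gfn == gfn)
            return tbl[i].allowed;
    return PERM_RWX;
}

/*
 * Pick the violation type to report for an instruction-fetch trap.
 *   fetch_gfn: frame holding the instruction being fetched
 *   walk_gfns: frames holding the pagetables used to translate the PC
 *              (assumed to be gathered by the caller)
 * Returns PERM_X if execution of fetch_gfn was restricted, PERM_R if a
 * pagetable frame was unreadable, or 0 if no restriction explains the trap.
 */
static unsigned classify_fetch_trap(const struct restriction *tbl, size_t n,
                                    uint64_t fetch_gfn,
                                    const uint64_t *walk_gfns, size_t n_walk)
{
    assert(tbl != NULL || n == 0);

    if (!(allowed_on(tbl, n, fetch_gfn) & PERM_X))
        return PERM_X;

    for (size_t i = 0; i < n_walk; i++)
        if (!(allowed_on(tbl, n, walk_gfns[i]) & PERM_R))
            return PERM_R;

    return 0;
}
```

Even in this simplified form the caller has to know every frame involved in translating the PC, which is where the different page types the OS could use come in.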

Tamas

 

