Re: [Xen-devel] [PATCH v2 20/25] arm/altp2m: Add altp2m paging mechanism.
Hi Julien,
On 08/06/2016 04:14 PM, Julien Grall wrote:
> On 06/08/2016 13:51, Sergej Proskurin wrote:
>> Hi Julien,
>
> Hello Sergej,
>
>> On 08/04/2016 03:50 PM, Julien Grall wrote:
>>> Hello Sergej,
>>>
>>> On 01/08/16 18:10, Sergej Proskurin wrote:
>>>> This commit adds the function "altp2m_lazy_copy" implementing the
>>>> altp2m
>>>> paging mechanism. The function "altp2m_lazy_copy" lazily copies the
>>>> hostp2m's mapping into the currently active altp2m view on 2nd stage
>>>> translation violations on instruction or data access. Every altp2m
>>>> violation generates a vm_event.
>>>
>>> I think you mean "translation fault" and not "violation". The latter
>>> sounds more like a permission issue, which is not the case here.
>>>
>>
>> The implemented paging mechanism also covers permission issues, which is
>> the reason why I chose the term violation. By this, I mean that the
>> implementation covers traps to Xen that occur due to 2nd stage permission
>> violations (as altp2m views might have set different access permissions
>> to trap on). It also covers translation faults due to entries that are
>> not present in the active altp2m view.
>
> FSC_FLT_TRANS can only happen for a translation fault. And you don't
> modify FSC_FLT_PERM (except moving code around, as usual...). So why
> are you talking about permission violations?
>
I'm afraid there was a misunderstanding from my side. You are right, I
will change this in the next patch.
>>
>>> However I am not sure what you mean by "every altp2m violation
>>> generates a vm_event". Do you mean the userspace will be aware of it?
>>>
>>
>> No. Every time the altp2m configuration lets the guest trap into Xen
>> due to a lack of memory access permissions (e.g., on execution of an rw
>> page), we fill the associated fields in the req buffer in
>> mem_access_check so that the management domain receives the information
>> required to understand what kind of altp2m violation just happened.
>> Based on this information, it might decide what to do next (perform
>> additional checks or simply change the altp2m view to continue guest
>> execution).
>
> You will receive a FSC_FLT_PERM in this case and not a FSC_FLT_TRANS.
>
I do agree now.
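
Just to make explicit what I meant above by "the information required":
on such a FSC_FLT_PERM trap, p2m_mem_access_check fills the vm_event
request roughly as sketched below before it is sent to the monitoring
domain. Field and flag names are taken from the public vm_event
interface as I remember it, so please treat this as an illustration
rather than the exact code:

    /*
     * Illustrative sketch only: the kind of information handed to the
     * monitoring domain on a permission fault.
     */
    vm_event_request_t req = {};

    req.reason = VM_EVENT_REASON_MEM_ACCESS;
    req.vcpu_id = v->vcpu_id;

    req.u.mem_access.gfn = gpa >> PAGE_SHIFT;
    req.u.mem_access.offset = gpa & ~PAGE_MASK;
    req.u.mem_access.flags |= npfec.read_access  ? MEM_ACCESS_R : 0;
    req.u.mem_access.flags |= npfec.write_access ? MEM_ACCESS_W : 0;
    req.u.mem_access.flags |= npfec.insn_fetch   ? MEM_ACCESS_X : 0;

    if ( npfec.gla_valid )
    {
        req.u.mem_access.flags |= MEM_ACCESS_GLA_VALID;
        req.u.mem_access.gla = gva;
    }

    /*
     * The request is then handed to the ring via the mem_access/vm_event
     * machinery (IIRC mem_access_send_req() in the current tree).
     */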
> [...]
>
>>>> + struct npfec npfec,
>>>> + struct p2m_domain **ap2m)
>>>
>>> Why do you need the parameter ap2m? None of the callers make use of it
>>> except setting it.
>>>
>>
>> True. Another leftover from the x86 implementation. I will change that.
>>
>>>> +{
>>>> + struct domain *d = v->domain;
>>>> + struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
>>>
>>> p2m_get_hostp2m(d);
>>>
>>
>> Thanks.
>>
>>>> + p2m_type_t p2mt;
>>>> + xenmem_access_t xma;
>>>> + gfn_t gfn = _gfn(paddr_to_pfn(gpa));
>>>> + mfn_t mfn;
>>>> + unsigned int level;
>>>> + int rc = 0;
>>>
>>> Please use true/false rather than 0/1. Also this should be bool_t.
>>>
>>
>> Ok.
>>
>>>> +
>>>> + static const p2m_access_t memaccess[] = {
>>>> +#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
>>>> + ACCESS(n),
>>>> + ACCESS(r),
>>>> + ACCESS(w),
>>>> + ACCESS(rw),
>>>> + ACCESS(x),
>>>> + ACCESS(rx),
>>>> + ACCESS(wx),
>>>> + ACCESS(rwx),
>>>> + ACCESS(rx2rw),
>>>> + ACCESS(n2rwx),
>>>> +#undef ACCESS
>>>> + };
>>>> +
>>>> + *ap2m = altp2m_get_altp2m(v);
>>>> + if ( *ap2m == NULL)
>>>> + return 0;
>>>> +
>>>> + /* Check if entry is part of the altp2m view */
>>>> + mfn = p2m_lookup_attr(*ap2m, gfn, NULL, NULL, NULL, NULL);
>>>> + if ( !mfn_eq(mfn, INVALID_MFN) )
>>>> + goto out;
>>>> +
>>>> + /* Check if entry is part of the host p2m view */
>>>> + mfn = p2m_lookup_attr(hp2m, gfn, &p2mt, &level, NULL, &xma);
>>>> + if ( mfn_eq(mfn, INVALID_MFN) )
>>>> + goto out;
>>>
>>> This is quite racy. The page could be removed from the host p2m by the
>>> time you have added it in the altp2m because you have no lock.
>>>
>>
>> In the last patch series, I proposed to lock the hp2m->lock
>> (write_lock), which is not a good solution at this point, as it would
>> potentially create lots of contention on hp2m. Also, we would need to
>> export __p2m_lookup or remove the lock from p2m_lookup_attr, which is
>> not a good solution either.
>
> The P2M lock has been converted to a read-write lock. So it will not
> create contention, as multiple reads can be done concurrently.
>
> You should be more concerned about security than contention. The
> former may lead to an exploit or to corrupting Xen. The latter will
> only impact performance.
>
>>
>> The addition of the altp2m_lock could help in this case: if a page were
>> removed from the hostp2m before we added it to the altp2m view, the
>> propagate function would need to wait until lazy_copy finished and would
>> eventually remove it from the altp2m view. On the other hand, this would
>> significantly decrease performance on a multi-core system.
>>
>> If I understand that correctly, a better solution would be to use
>> p2m_read_lock(hp2m), as we would still allow reading but not writing (in
>> case the hp2m gets entries removed in apply_p2m_changes). That is, I
>> would take it right before p2m_lookup_attr(hp2m, ...) and release it
>> right after modify_altp2m_entry. This solution would not create a
>> bottleneck in the lazy_copy mechanism and would simultaneously prevent
>> hp2m from changing. What do you think?
>
> That's the solution I would like to see and the only safe one.
>
Great, I will do that. Thank you.
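
To make sure we mean the same thing, below is a rough sketch of
altp2m_lazy_copy with the hostp2m read lock held across the lookup and
the propagation (read_lock(&hp2m->lock), matching what altp2m_reset does
further down). It also uses bool_t/true/false as requested above, drops
the unused ap2m parameter (and, for brevity, the gva/npfec arguments),
and reduces the error message to the gfn. memaccess[] is the
XENMEM_access_* -> p2m_access_* table from this patch, and I am assuming
p2m_lookup_attr() can be called with the read lock already held (or gets
an unlocked variant):

static bool_t altp2m_lazy_copy(struct vcpu *v, paddr_t gpa)
{
    struct domain *d = v->domain;
    struct p2m_domain *hp2m = p2m_get_hostp2m(d);
    struct p2m_domain *ap2m = altp2m_get_altp2m(v);
    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
    p2m_type_t p2mt;
    xenmem_access_t xma;
    unsigned int level;
    mfn_t mfn;
    bool_t rc = false;

    if ( ap2m == NULL )
        return false;

    /* Nothing to do if the active altp2m view already maps this gfn. */
    mfn = p2m_lookup_attr(ap2m, gfn, NULL, NULL, NULL, NULL);
    if ( !mfn_eq(mfn, INVALID_MFN) )
        return false;

    /*
     * Hold the hostp2m read lock across the lookup and the propagation so
     * that the entry cannot be removed from the hostp2m in between.
     */
    read_lock(&hp2m->lock);

    mfn = p2m_lookup_attr(hp2m, gfn, &p2mt, &level, NULL, &xma);
    if ( mfn_eq(mfn, INVALID_MFN) )
        goto out;

    if ( modify_altp2m_entry(d, ap2m, gpa, pfn_to_paddr(mfn_x(mfn)),
                             level, p2mt, memaccess[xma]) )
    {
        gdprintk(XENLOG_ERR, "failed to set altp2m entry for gfn %#lx\n",
                 gfn_x(gfn));
        domain_crash(d);
        goto out;
    }

    rc = true;

 out:
    read_unlock(&hp2m->lock);

    return rc;
}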
>>
>>>> +
>>>> + rc = modify_altp2m_entry(d, *ap2m, gpa, pfn_to_paddr(mfn_x(mfn)), level,
>>>> + p2mt, memaccess[xma]);
>>>
>>> Please avoid mixing bool and int even though today we have implicit
>>> conversion.
>>>
>>
>> Ok.
>>
>>>> + if ( rc )
>>>> + {
>>>> + gdprintk(XENLOG_ERR, "failed to set entry for %lx -> %lx p2m %lx\n",
>>>> + (unsigned long)gpa, (unsigned long)(paddr_to_pfn(mfn_x(mfn))),
>>>
>>> By using (unsigned long) you will truncate the address on ARM32,
>>> because we are able to support up to 40-bit addresses.
>>>
>>> Also why do you print the full address? The guest physical address may
>>> not be page-aligned so it will confuse the user.
>>>
>>
>> x86 leftover. I will change that.
>
> It is not in the x86 code....
>
>>>> + (unsigned long)*ap2m);
>>>
>>> It does not seem really helpful to print the pointer here. You will
>>> not be able to make use of it when reading the log. Also, this should
>>> be printed with "%p" and not using a cast.
>>>
>>
>> Another x86 leftover. I will change that.
>>
>>>> + domain_crash(hp2m->domain);
>>>> + }
>>>> +
>>>> + rc = 1;
>>>> +
>>>> +out:
>>>> + return rc;
>>>> +}
>>>> +
>>>> static inline void altp2m_reset(struct p2m_domain *p2m)
>>>> {
>>>> read_lock(&p2m->lock);
>>>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>>>> index 31810e6..bee8be7 100644
>>>> --- a/xen/arch/arm/p2m.c
>>>> +++ b/xen/arch/arm/p2m.c
>>>> @@ -1812,6 +1812,12 @@ void __init setup_virt_paging(void)
>>>> smp_call_function(setup_virt_paging_one, (void *)val, 1);
>>>> }
>>>>
>>>> +void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
>>>> +{
>>>> + if ( altp2m_active(v->domain) )
>>>> + altp2m_switch_vcpu_altp2m_by_id(v, idx);
>>>> +}
>>>> +
>>>
>>> I am not sure why this function lives here.
>>>
>>
>> This function is used in ./xen/common/vm_event.c. Since the function is
>> used from a common place and is already named p2m_*, I did not want to
>> pull it out of p2m.c (and end up with a p2m_* function prefix in
>> altp2m.c). However, I could move the function to altp2m.c and rename it
>> globally (also in the x86 implementation). Or I could simply move it to
>> altp2m.c despite the name. What would you prefer?
>
> It seems that the altp2m functions on x86 will be moved from p2m.c to
> altp2m.c. So you may want to rename the function here.
>
Alright.
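
Something along these lines then, I assume (altp2m_check is only my
guess at the name the x86 rework will settle on):

    /* xen/arch/arm/altp2m.c */
    void altp2m_check(struct vcpu *v, uint16_t idx)
    {
        if ( altp2m_active(v->domain) )
            altp2m_switch_vcpu_altp2m_by_id(v, idx);
    }

The caller in xen/common/vm_event.c would then use the same name on both
architectures.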
> [...]
>
>>>> + * currently active altp2m view.
>>>> + */
>>>> + if ( altp2m_lazy_copy(v, gpa, gva, npfec, &p2m) )
>>>> + return;
>>>> +
>>>> + rc = p2m_mem_access_check(gpa, gva, npfec);
>>>
>>> Why do you call p2m_mem_access_check here? If you are here it is for a
>>> translation fault which you handle via altp2m_lazy_copy.
>>>
>>
>> Right. I have noticed that the test systems jump into the FSC_FLT_TRANS
>> case right after I have lazily copied the page into the associated
>> altp2m view. I am not sure what the issue might be here.
>
> I don't understand. What do you mean?
>
Never mind: It was the misunderstanding I was mentioning at the
beginning of this email. The call to p2m_mem_access_check will be
removed in the next patch. Thank you.
>>
>>>> +
>>>> + /* Trap was triggered by mem_access, work here is done */
>>>> + if ( !rc )
>>>> + return;
>>>> + }
>>>> +
>>>> + break;
>>>> + }
>>>> case FSC_FLT_PERM:
>>>> {
>>>> - paddr_t gpa;
>>>> const struct npfec npfec = {
>>>> .insn_fetch = 1,
>>>> .gla_valid = 1,
>>>> .kind = hsr.iabt.s1ptw ? npfec_kind_in_gpt : npfec_kind_with_gla
>>>> };
>>>>
>>>> - if ( hpfar_is_valid(hsr.iabt.s1ptw, fsc) )
>>>> - gpa = get_faulting_ipa(gva);
>>>> - else
>>>> - {
>>>> - /*
>>>> - * Flush the TLB to make sure the DTLB is clear before
>>>> - * doing GVA->IPA translation. If we got here because of
>>>> - * an entry only present in the ITLB, this translation may
>>>> - * still be inaccurate.
>>>> - */
>>>> - flush_tlb_local();
>>>> -
>>>> - rc = gva_to_ipa(gva, &gpa, GV2M_READ);
>>>> - if ( rc == -EFAULT )
>>>> - return; /* Try again */
>>>> - }
>>>> -
>>>> rc = p2m_mem_access_check(gpa, gva, npfec);
>>>>
>>>> /* Trap was triggered by mem_access, work here is done */
>>>> @@ -2451,6 +2482,8 @@ static void do_trap_data_abort_guest(struct cpu_user_regs *regs,
>>>> int rc;
>>>> mmio_info_t info;
>>>> uint8_t fsc = hsr.dabt.dfsc & ~FSC_LL_MASK;
>>>> + struct vcpu *v = current;
>>>> + struct p2m_domain *p2m = NULL;
>>>>
>>>> info.dabt = dabt;
>>>> #ifdef CONFIG_ARM_32
>>>> @@ -2459,7 +2492,7 @@ static void do_trap_data_abort_guest(struct cpu_user_regs *regs,
>>>> info.gva = READ_SYSREG64(FAR_EL2);
>>>> #endif
>>>>
>>>> - if ( hpfar_is_valid(hsr.iabt.s1ptw, fsc) )
>>>> + if ( hpfar_is_valid(hsr.dabt.s1ptw, fsc) )
>>>> info.gpa = get_faulting_ipa(info.gva);
>>>> else
>>>> {
>>>> @@ -2470,23 +2503,31 @@ static void do_trap_data_abort_guest(struct cpu_user_regs *regs,
>>>>
>>>> switch ( fsc )
>>>> {
>>>> - case FSC_FLT_PERM:
>>>> + case FSC_FLT_TRANS:
>>>> {
>>>> - const struct npfec npfec = {
>>>> - .read_access = !dabt.write,
>>>> - .write_access = dabt.write,
>>>> - .gla_valid = 1,
>>>> - .kind = dabt.s1ptw ? npfec_kind_in_gpt : npfec_kind_with_gla
>>>> - };
>>>> + if ( altp2m_active(current->domain) )
>>>
>>> I would much prefer to check altp2m only if the MMIO was not
>>> emulated (so move the code afterwards). This will avoid adding
>>> overhead when accessing the virtual interrupt controller.
>>
>> I am not sure whether I understood your request. Could you be more
>> specific, please? What exactly shall be moved where?
>
> With this patch, the translation fault will do:
> 1) Check altp2m
> 2) Emulate MMIO
>
> So if altp2m is enabled, you will add overhead for any MMIO access
> (such as the virtual interrupt controller).
>
> I would much prefer to see 2) then 1).
>
I see. I will change that. Thanks.
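
For the data abort path, the translation-fault case would then roughly
become the following, with the MMIO emulation tried first and the altp2m
lazy copy only afterwards (and without the p2m_mem_access_check call, as
discussed above). This is only a sketch of the ordering, using the
reduced altp2m_lazy_copy() signature from the sketch earlier in this
mail:

    case FSC_FLT_TRANS:
        if ( dabt.s1ptw )
            goto bad_data_abort;

        /* ... instruction syndrome checks / decoding as before ... */

        /* 2) Try to emulate the MMIO access first. */
        if ( handle_mmio(&info) )
        {
            advance_pc(regs, hsr);
            return;
        }

        /*
         * 1) Only consult the altp2m view if the access was not emulated
         * MMIO, so that altp2m adds no overhead to e.g. vGIC accesses.
         */
        if ( altp2m_active(v->domain) && altp2m_lazy_copy(v, info.gpa) )
            return;

        break;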
> [...]
>
>>>> +
>>>> + /* Trap was triggered by mem_access, work here is done */
>>>> + if ( !rc )
>>>> + return;
>>>> + }
>>>>
>>>> - /* Trap was triggered by mem_access, work here is done */
>>>> - if ( !rc )
>>>> - return;
>>>> - break;
>>>> - }
>>>> - case FSC_FLT_TRANS:
>>>> if ( dabt.s1ptw )
>>>> goto bad_data_abort;
>>>>
>>>> @@ -2515,6 +2556,23 @@ static void do_trap_data_abort_guest(struct
>>>> cpu_user_regs *regs,
>>>> return;
>>>> }
>>>> break;
>>>> + }
>>>> + case FSC_FLT_PERM:
>>>> + {
>>>> + const struct npfec npfec = {
>>>> + .read_access = !dabt.write,
>>>> + .write_access = dabt.write,
>>>> + .gla_valid = 1,
>>>> + .kind = dabt.s1ptw ? npfec_kind_in_gpt : npfec_kind_with_gla
>>>> + };
>>>> +
>>>> + rc = p2m_mem_access_check(info.gpa, info.gva, npfec);
>>>> +
>>>> + /* Trap was triggered by mem_access, work here is done */
>>>> + if ( !rc )
>>>> + return;
>>>> + break;
>>>> + }
>>>
>>> Why did you move the case handling FSC_FLT_PERM?
>>>
>>
>> No really important reason: I moved it simply because int(FSC_FLT_TRANS)
>> < int(FSC_FLT_PERM). I can move it back if you like.
>
> Again, you should not make such a change in the same patch that
> introduces new code. It took me a while to understand that you only
> moved code.
>
> This series is already difficult enough to review. So please don't add
> extra work by mixing code movement and new code in the same patch.
>
I will try to avoid unnecessary code movement.
Best regards,
~Sergej
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel