[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V5 2/3] x86/mm: allocate logdirty_ranges for altp2ms



On 11/14/18 4:00 PM, Jan Beulich wrote:
>>>> On 14.11.18 at 13:50, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> On 11/14/18 1:58 PM, George Dunlap wrote:
>>> On 11/13/18 6:43 PM, Razvan Cojocaru wrote:
>>>> On 11/13/18 7:57 PM, George Dunlap wrote:
>>>>> On 11/11/18 2:07 PM, Razvan Cojocaru wrote:
>>>>>> @@ -2341,6 +2380,7 @@ int p2m_destroy_altp2m_by_id(struct domain *d, 
>>>>>> unsigned int idx)
>>>>>>          {
>>>>>>              p2m_flush_table(d->arch.altp2m_p2m[idx]);
>>>>>>              /* Uninit and reinit ept to force TLB shootdown */
>>>>>> +            p2m_free_logdirty(d->arch.altp2m_p2m[idx]);
>>>>>>              ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
>>>>>>              ept_p2m_init(d->arch.altp2m_p2m[idx]);
>>>>>>              d->arch.altp2m_eptp[idx] = mfn_x(INVALID_MFN);
>>>>>
>>>>> (In case I forget: Also, this is called without holding the appropriate
>>>>> p2m lock. )
>>>>
>>>> Could you please provide more details? I have assumed that at the point
>>>> of calling a function called p2m_destroy_altp2m_by_id() it should be
>>>> safe to tear the altp2m down without further precaution.
>>>
>>> Are you absolutely positive that at this point there's no way anywhere
>>> else in Xen might be doing something with this p2m struct?
>>>
>>> If so, then 1) there should be a comment there explaining why that's the
>>> case, and 2) ideally we should refactor p2m_flush_table such that we can
>>> call what's now p2m_flush_table_locked() without the lock.
>>
>> AFAICT the only place p2m_destroy_altp2m_by_id() is ever called is in
>> arch/x86/hvm/hvm.c's do_altp2m_op() (on HVMOP_altp2m_destroy_p2m), which
>> is done under domain lock. Is that insufficient?
> 
> Holding the domain lock does not imply nothing can happen to the
> domain elsewhere. Only if both parties hold the _same_ lock there
> is a guarantee of serialization between both.

Right, I was under the impression that for the duration of a HVMOP (or
DOMCTL) nothing moves in the domain.

In that case, we do need the locking as George has suggested.

>>>>>> Is there any particular reason we allocate the p2m structures on domain
>>>>>> creation, but not logdirty range structures?  It seems like allocating
>>>>>> altp2m structures on-demand, rather than at domain creation time, might
>>>>>> make a lot of the reasoning here simpler.
>>>>>
>>>>> I assume that this question is not addressed to me, since I'm not able
>>>>> to answer it - I can only assume that having less heap used has been
>>>>> preferred.
>>>
>>> I'm asking you because you've recently been going through this code, and
>>> probably have at least as much familiarity with it as I do.  I can't
>>> immediately see any reason to allocate them at domain creation time.
>>> Maybe you can and maybe you can't, but I won't know until I ask. :-)
>>
>> I've looked at the code closer today, and there's no reason as far as I
>> can tell why we shouldn't allocate altp2ms on-demand. However, changing
>> the code is somewhat involved at this point, since there's a lot of:
>>
>> 2357     if ( d->arch.altp2m_eptp[idx] != mfn_x(INVALID_MFN) )
>> 2358     {
>> 2359         p2m = d->arch.altp2m_p2m[idx];
>> 2360
>> 2361         if ( !_atomic_read(p2m->active_vcpus) )
>> 2362         {
>> 2363             p2m_flush_table(d->arch.altp2m_p2m[idx]);
>> 2364             /* Uninit and reinit ept to force TLB shootdown */
>> 2365             ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
>> 2366             ept_p2m_init(d->arch.altp2m_p2m[idx]);
>> 2367             d->arch.altp2m_eptp[idx] = mfn_x(INVALID_MFN);
>> 2368             rc = 0;
>> 2369         }
>> 2370     }
>>
>> going on. That is, code checking that d->arch.altp2m_eptp[idx] !=
>> mfn_x(INVALID_MFN), and then blindly assuming that p2m will not be NULL
>> and is usable.
> 
> Wouldn't the implication of George's proposal be that
> d->arch.altp2m_eptp[] slots get demand-populated, too?

Of course, but still we must make sure that that really does happen in
all (corner) cases, and that the two never get out of sync (some
function sets d->arch.altp2m_p2m[idx] to NULL while leaving
d->arch.altp2m_eptp[idx] != mfn_x(INVALID_MFN), for example).

My point was just that this requires quite a bit of testing to make sure
we got it right IMHO. Which we should do, but it's also very important
to get the display problem fixed.


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.