>From: Isaku Yamahata [mailto:yamahata@xxxxxxxxxxxxx]
>Sent: May 10, 2007 10:59
>To: Xu, Anthony
>Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>Subject: Re: [Xen-ia64-devel][PATCH] handle speculative vhpt walk
>
>On Wed, May 09, 2007 at 03:28:54PM +0800, Xu, Anthony wrote:
>> diff -r eabda101b0c5 xen/arch/ia64/vmx/vmx_ivt.S
>> --- a/xen/arch/ia64/vmx/vmx_ivt.S Tue May 08 13:12:52 2007 -0600
>> +++ b/xen/arch/ia64/vmx/vmx_ivt.S Wed May 09 13:52:40 2007 +0800
>> @@ -191,10 +192,11 @@ vmx_itlb_loop:
>> dep r25 = r19, r25, 56, 4
>> ;;
>> st8 [r16] = r22
>> - st8 [r28] = r29
>> + st8 [r28] = r29, VLE_TITAG_OFFSET - VLE_ITIR_OFFSET
>> st8 [r18] = r25
>> st8 [r17] = r27
>> ;;
>> + st8 [r28] = r24
>> itc.i r25
>> dv_serialize_data
>> mov r17=cr.isr
>
>There is no memory barrier or release store here, unlike in
>vmx_vhpt_insert()/vhpt_insert(). Is this OK?
>It is probably necessary to insert mfence or to replace some of the st8
>with st8.rel.
Oops, I sent the old patch; I have added mb and rel in the new one.
Thanks,
>
>
>> @@ -269,10 +272,11 @@ vmx_dtlb_loop:
>> dep r25 = r19, r25, 56, 4
>> ;;
>> st8 [r16] = r22
>> - st8 [r28] = r29
>> + st8 [r28] = r29, VLE_TITAG_OFFSET - VLE_ITIR_OFFSET
>> st8 [r18] = r25
>> st8 [r17] = r27
>> - ;;
>> + ;;
>> + st8 [r28] = r24
>> itc.d r25
>> dv_serialize_data
>> mov r17=cr.isr
>
>ditto.
>
>
>> diff -r eabda101b0c5 xen/arch/ia64/vmx/vtlb.c
>> --- a/xen/arch/ia64/vmx/vtlb.c Tue May 08 13:12:52 2007 -0600
>> +++ b/xen/arch/ia64/vmx/vtlb.c Wed May 09 14:20:30 2007 +0800
>> @@ -175,15 +173,17 @@ static void vmx_vhpt_insert(thash_cb_t *
>> }
>> local_irq_disable();
>> *cch = *head;
>> + head->ti = 1;
>> head->next = cch;
>> - len = cch->len+1;
>> + head->len = cch->len+1;
>> cch->len = 0;
>> local_irq_enable();
>> }
>> -
>> + //here head is invalid
>> + wmb();
>> head->page_flags=pte;
>> - head->len = len;
>> head->itir = rr.ps << 2;
>> + wmb();
>> head->etag=tag;
>> return;
>> }
>
>How about this? On ia64 a volatile store is compiled to st8.rel,
>so this avoids mfence.
> *(volatile unsigned long*)&head->page_flags=pte;
> *(volatile unsigned long*)&head->itir = rr.ps << 2;
> *(volatile unsigned long*)&head->etag=tag;
>
>
>> diff -r eabda101b0c5 xen/arch/ia64/xen/vhpt.c
>> --- a/xen/arch/ia64/xen/vhpt.c Tue May 08 13:12:52 2007 -0600
>> +++ b/xen/arch/ia64/xen/vhpt.c Wed May 09 14:27:16 2007 +0800
>> @@ -78,10 +78,13 @@ void vhpt_insert (unsigned long vadr, un
>> struct vhpt_lf_entry *vlfe = (struct vhpt_lf_entry *)ia64_thash(vadr);
>> unsigned long tag = ia64_ttag (vadr);
>>
>> - /* No need to first disable the entry, since VHPT is per LP
>> - and VHPT is TR mapped. */
>> + /* Even though VHPT is per VCPU, still need to first disable the entry,
>> + * because the processor may support speculative VHPT walk. */
>> + vlfe->ti_tag = INVALID_TI_TAG;
>> + wmb();
>> vlfe->itir = logps;
>> vlfe->page_flags = pte | _PAGE_P;
>> + wmb();
>> vlfe->ti_tag = tag;
>> }
>>
>>
>
>ditto.
> vlfe->ti_tag = INVALID_TI_TAG;
> *(volatile unsigned long*)&vlfe->itir = logps;
> *(volatile unsigned long*)&vlfe->page_flags = pte | _PAGE_P;
> *(volatile unsigned long*)&vlfe->ti_tag = tag;
Another choice is:
vlfe->ti_tag = INVALID_TI_TAG;
wmb();
vlfe->itir = logps;
vlfe->page_flags = pte | _PAGE_P;
*(volatile unsigned long*)&vlfe->ti_tag = tag;
Do you know which one is the fastest?
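
To make the ordering requirement concrete, here is a minimal portable sketch of the "invalidate, fill, publish" sequence using C11 atomics instead of wmb() or volatile casts. It is only an illustration, not the patch itself: struct demo_vhpt_entry, DEMO_INVALID_TI_TAG (assumed here to be bit 63), DEMO_PAGE_P and demo_vhpt_insert() are hypothetical stand-ins for the real Xen definitions, and the C11 memory model says nothing about a hardware VHPT walker, so this only shows the intended store order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Xen definitions; values are assumptions. */
#define DEMO_INVALID_TI_TAG (UINT64_C(1) << 63) /* assumed: ti bit is bit 63 */
#define DEMO_PAGE_P         UINT64_C(1)         /* assumed: present bit */

/* Illustrative layout only, not the real struct vhpt_lf_entry. */
struct demo_vhpt_entry {
    uint64_t page_flags;
    uint64_t itir;
    _Atomic uint64_t ti_tag;
};

static void demo_vhpt_insert(struct demo_vhpt_entry *e, uint64_t pte,
                             uint64_t logps, uint64_t tag)
{
    /* A valid tag must not have the invalid bit set. */
    assert((tag & DEMO_INVALID_TI_TAG) == 0);

    /* 1. Disable the entry so a concurrent walker cannot match it. */
    atomic_store_explicit(&e->ti_tag, DEMO_INVALID_TI_TAG,
                          memory_order_relaxed);

    /* 2. Keep the payload stores after the invalidation
     *    (plays the role of the first wmb()). */
    atomic_thread_fence(memory_order_release);
    e->itir = logps;
    e->page_flags = pte | DEMO_PAGE_P;

    /* 3. Publish: a release store orders the payload stores before the
     *    tag store (plays the role of st8.rel on the final store). */
    atomic_store_explicit(&e->ti_tag, tag, memory_order_release);
}
```

This corresponds to the second variant above: one barrier after the invalidation, plain stores for the payload, and a single release store for the tag.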
Thanks,
Anthony