|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [V11 PATCH 17/21] PVH xen: vmcs related changes
>>> On 23.08.13 at 03:19, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> wrote:
> This patch contains vmcs changes related for PVH, mainly creating a VMCS
> for PVH guest.
>
> Changes in V11:
> - Remove pvh_construct_vmcs and make it part of construct_vmcs
>
> Signed-off-by: Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
> Acked-by: Keir Fraser <keir@xxxxxxx>
> Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> PV-HVM-Regression-Tested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Same comment as for #9 - too many changes to retain the tags.
> @@ -874,16 +935,37 @@ static int construct_vmcs(struct vcpu *v)
> if ( d->arch.vtsc )
> v->arch.hvm_vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
>
> - v->arch.hvm_vmx.secondary_exec_control = vmx_secondary_exec_control;
> + if ( is_pvh_vcpu(v) )
> + {
> + /* Phase I PVH: we run with minimal secondary exec features */
> + u32 bits = SECONDARY_EXEC_ENABLE_EPT | SECONDARY_EXEC_ENABLE_VPID;
>
> - /* Disable VPID for now: we decide when to enable it on VMENTER. */
> - v->arch.hvm_vmx.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VPID;
> + /* Intel SDM: resvd bits are 0 */
> + v->arch.hvm_vmx.secondary_exec_control = bits;
> + }
> + else
> + {
> + v->arch.hvm_vmx.secondary_exec_control = vmx_secondary_exec_control;
> +
> + /* Disable VPID for now: we decide when to enable it on VMENTER. */
> + v->arch.hvm_vmx.secondary_exec_control &=
> ~SECONDARY_EXEC_ENABLE_VPID;
> + }
So this difference in VPID handling needs some explanation, as I
don't immediately see how you adjust the code referred to by the
non-PVH comment above.
> if ( paging_mode_hap(d) )
> {
> v->arch.hvm_vmx.exec_control &= ~(CPU_BASED_INVLPG_EXITING |
> CPU_BASED_CR3_LOAD_EXITING |
> CPU_BASED_CR3_STORE_EXITING);
> + if ( is_pvh_vcpu(v) )
> + {
> + u32 bits = CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
> + CPU_BASED_ACTIVATE_MSR_BITMAP;
> + v->arch.hvm_vmx.exec_control |= bits;
> +
> + bits = CPU_BASED_USE_TSC_OFFSETING | CPU_BASED_TPR_SHADOW |
> + CPU_BASED_VIRTUAL_NMI_PENDING;
> + v->arch.hvm_vmx.exec_control &= ~bits;
> + }
Did you notice that the original adjustments were truly HAP-related?
Putting your PVH code here just because it takes HAP as a prereq is
not the way to go. Wherever the flags you want set get set for the
HVM case, you should add your adjustments (if any are necessary
at all).
> @@ -905,9 +987,22 @@ static int construct_vmcs(struct vcpu *v)
>
> vmx_update_cpu_exec_control(v);
> __vmwrite(VM_EXIT_CONTROLS, vmexit_ctl);
> +
> + if ( is_pvh_vcpu(v) )
> + {
> + /*
> + * Note: we run with default VM_ENTRY_LOAD_DEBUG_CTLS of 1, which
> means
> + * upon vmentry, the cpu reads/loads VMCS.DR7 and VMCS.DEBUGCTLS, and
> + * not use the host values. 0 would cause it to not use the VMCS
> values.
> + */
> + vmentry_ctl &= ~(VM_ENTRY_LOAD_GUEST_EFER | VM_ENTRY_SMM |
> + VM_ENTRY_DEACT_DUAL_MONITOR);
> + /* PVH 32bitfixme. */
> + vmentry_ctl |= VM_ENTRY_IA32E_MODE; /* GUEST_EFER.LME/LMA ignored
> */
> + }
> __vmwrite(VM_ENTRY_CONTROLS, vmentry_ctl);
This is misplaced too - we're past the point of determining the set of
flags, and already in the process of committing them. The code
again should go alongside where the corresponding HVM code sits.
> - if ( cpu_has_vmx_ple )
> + if ( cpu_has_vmx_ple && !is_pvh_vcpu(v) )
> {
> __vmwrite(PLE_GAP, ple_gap);
> __vmwrite(PLE_WINDOW, ple_window);
Why would this be conditional upon !PVH?
> @@ -921,28 +1016,46 @@ static int construct_vmcs(struct vcpu *v)
> if ( cpu_has_vmx_msr_bitmap )
> {
> unsigned long *msr_bitmap = alloc_xenheap_page();
> + int msr_type = MSR_TYPE_R | MSR_TYPE_W;
>
> if ( msr_bitmap == NULL )
> + {
> + vmx_vmcs_exit(v);
> return -ENOMEM;
> + }
This is a genuine bug fix, which should be contributed separately
(so it becomes backportable).
> - vmx_disable_intercept_for_msr(v, MSR_FS_BASE, MSR_TYPE_R |
> MSR_TYPE_W);
> - vmx_disable_intercept_for_msr(v, MSR_GS_BASE, MSR_TYPE_R |
> MSR_TYPE_W);
> - vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_CS, MSR_TYPE_R |
> MSR_TYPE_W);
> - vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_ESP, MSR_TYPE_R |
> MSR_TYPE_W);
> - vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_EIP, MSR_TYPE_R |
> MSR_TYPE_W);
> + /* Disable interecepts for MSRs that have corresponding VMCS fields.
> */
> + vmx_disable_intercept_for_msr(v, MSR_FS_BASE, msr_type);
> + vmx_disable_intercept_for_msr(v, MSR_GS_BASE, msr_type);
> + vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_CS, msr_type);
> + vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_ESP, msr_type);
> + vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_EIP, msr_type);
> +
> if ( cpu_has_vmx_pat && paging_mode_hap(d) )
> - vmx_disable_intercept_for_msr(v, MSR_IA32_CR_PAT, MSR_TYPE_R |
> MSR_TYPE_W);
> + vmx_disable_intercept_for_msr(v, MSR_IA32_CR_PAT, msr_type);
These changes all look pointless; if you really think they're worthwhile
cleanup, they don't belong here.
> - if ( cpu_has_vmx_virtual_intr_delivery )
> + if ( cpu_has_vmx_virtual_intr_delivery && !is_pvh_vcpu(v) )
> {
> /* EOI-exit bitmap */
> v->arch.hvm_vmx.eoi_exit_bitmap[0] = (uint64_t)0;
> @@ -958,7 +1071,7 @@ static int construct_vmcs(struct vcpu *v)
> __vmwrite(GUEST_INTR_STATUS, 0);
> }
>
> - if ( cpu_has_vmx_posted_intr_processing )
> + if ( cpu_has_vmx_posted_intr_processing && !is_pvh_vcpu(v) )
> {
> __vmwrite(PI_DESC_ADDR, virt_to_maddr(&v->arch.hvm_vmx.pi_desc));
> __vmwrite(POSTED_INTR_NOTIFICATION_VECTOR, posted_intr_vector);
These are likely to be meant as temporary changes only? In which
case they ought to be marked as such.
> @@ -1032,16 +1150,36 @@ static int construct_vmcs(struct vcpu *v)
>
> v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
> | (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault))
> + | (is_pvh_vcpu(v) ? (1U << TRAP_int3) | (1U << TRAP_debug) : 0)
> | (1U << TRAP_no_device);
What's so special about PVH that it requires these extra intercepts?
> v->arch.hvm_vcpu.guest_cr[0] = X86_CR0_PE | X86_CR0_ET;
> - hvm_update_guest_cr(v, 0);
> + if ( is_pvh_vcpu(v) )
I dislike your attitude of considering PVH the most important (thus
handled first) mode in general, but here it becomes most obvious:
The pre-existing code should really come first, and the new mode
should be handled in the "else" path. Same further down for the
CR4 handling.
> + {
> + u64 tmpval = v->arch.hvm_vcpu.guest_cr[0] | X86_CR0_PG | X86_CR0_NE;
> + v->arch.hvm_vcpu.hw_cr[0] = v->arch.hvm_vcpu.guest_cr[0] = tmpval;
So you set hw_cr[0] while I saw you saying hw_cr[4] won't be
touched by PVH in a reply to the v10 series. Such
inconsistencies are just calling for subsequent bugs. Either PVH
set all valid hw_cr[] fields, or it clearly states somewhere that
none of them are used (and then the assignment above ought to
go away).
> @@ -1297,6 +1440,9 @@ void vmx_do_resume(struct vcpu *v)
> hvm_asid_flush_vcpu(v);
> }
>
> + if ( is_pvh_vcpu(v) )
> + reset_stack_and_jump(vmx_asm_do_vmentry);
> +
> debug_state = v->domain->debugger_attached
> ||
> v->domain->arch.hvm_domain.params[HVM_PARAM_MEMORY_EVENT_INT3]
> ||
> v->domain->arch.hvm_domain.params[HVM_PARAM_MEMORY_EVENT_SINGLE_STEP];
Missing a "fixme" annotation?
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |