Re: [Xen-devel] Commit 1aeb1156fa43fe2cd2b5003995b20466cd19a622: "x86 don't change affinity with interrupt unmasked", APCI errors and assorted pci trouble
Friday, April 17, 2015, 1:43:32 PM, you wrote:
>>>> On 14.04.15 at 14:46, <linux@xxxxxxxxxxxxxx> wrote:
>> I just had a hunch: could it be related to Jiang Liu's kernel apci/irq
>> refactoring series, which already caused a lot of trouble with Xen in 3.17,
>> 3.18 and 3.19? And yes, that seems to be the case:
>>
>> On Xen without "x86 don't change affinity with interrupt unmasked"
>> - 3.16 && 3.19 && 4.0 all work fine
>>
>> On Xen with "x86 don't change affinity with interrupt unmasked"
>> - 3.16 (which is before that kernel refactoring series) works fine.
>> - 3.19 and 4.0 both give dom0 kernel hangs and these errors:
>> (XEN) [2015-03-26 20:35:42.205] APIC error on CPU0: 00(40)
>> (XEN) [2015-03-26 20:35:42.372] APIC error on CPU0: 40(40)
>>
>> (I haven't tested 3.17 and 3.18 because they have assorted problems due to
>> that series that weren't fixed before stable updates ended.)
>>
>> So it seems Jan's patch interferes with that patch series.
> That's rather odd a finding - the patch in question in fact uncovered
> a bug introduced in 2ca9fbd739 ("AMD IOMMU: allocate IRTE entries
> instead of using a static mapping") in that IO-APIC RTE reads would
> unconditionally translate the data (i.e. regardless of whether the
> entry was already in translated format). The patch below fixes this
> for me - can you please give this a try too?
> Thanks, Jan
> --- unstable.orig/xen/drivers/passthrough/amd/iommu_intr.c
> +++ unstable/xen/drivers/passthrough/amd/iommu_intr.c
> @@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_
>      unsigned int apic, unsigned int reg)
>  {
>      unsigned int val = __io_apic_read(apic, reg);
> +    unsigned int pin = (reg - 0x10) / 2;
> +    unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];
>
> -    if ( !(reg & 1) )
> +    if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
>      {
> -        unsigned int offset = val & (INTREMAP_ENTRIES - 1);
>          u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
>          u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
>          u16 req_id = get_intremap_requestor_id(seg, bdf);
>          const u32 *entry = get_intremap_entry(seg, req_id, offset);
>
> +        ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
>          val &= ~(INTREMAP_ENTRIES - 1);
>          val |= get_field_from_reg_u32(*entry,
>                                        INT_REMAP_ENTRY_INTTYPE_MASK,
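
For anyone following along, here is a minimal standalone sketch of the register arithmetic the patch above relies on. It is not Xen code: reg_to_pin(), needs_translation() and SKETCH_INTREMAP_ENTRIES are hypothetical names, and the table size is an assumed value. It only uses the facts that the IO-APIC redirection table starts at register 0x10 and that each pin occupies two 32-bit registers, the even one being the low word.

```c
#include <assert.h>

/* IO-APIC redirection table registers start at index 0x10; each pin uses
 * two 32-bit registers (even index = low word, odd index = high word). */
#define IOAPIC_REDIR_BASE 0x10u

/* Hypothetical stand-in for Xen's INTREMAP_ENTRIES; the value is assumed. */
#define SKETCH_INTREMAP_ENTRIES 512u

/* Map a redirection-table register index to its pin number,
 * mirroring "(reg - 0x10) / 2" from the patch. */
static unsigned int reg_to_pin(unsigned int reg)
{
    assert(reg >= IOAPIC_REDIR_BASE); /* guard against unsigned underflow */
    return (reg - IOAPIC_REDIR_BASE) / 2;
}

/* Mirror of the patched condition: only the low word is translated, and
 * only when the pin has an in-range remap index recorded for it. */
static int needs_translation(unsigned int reg, unsigned int offset)
{
    return !(reg & 1) && offset < SKETCH_INTREMAP_ENTRIES;
}
```

With these definitions, reg 0x12 maps to pin 1, and needs_translation(0x10, 0xffff) is false. That is the case the added "offset < INTREMAP_ENTRIES" check is meant to skip, assuming a pin_2_idx slot without an allocated IRTE holds an out-of-range value.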
Hmm, can this patch or Tim's patch make Andrew's patch ineffective?
I have now applied:
Jan's:
diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
index c1b76fb..879698e 100644
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_ire(
     unsigned int apic, unsigned int reg)
 {
     unsigned int val = __io_apic_read(apic, reg);
+    unsigned int pin = (reg - 0x10) / 2;
+    unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];
 
-    if ( !(reg & 1) )
+    if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
     {
-        unsigned int offset = val & (INTREMAP_ENTRIES - 1);
         u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
         u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
         u16 req_id = get_intremap_requestor_id(seg, bdf);
         const u32 *entry = get_intremap_entry(seg, req_id, offset);
 
+        ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
         val &= ~(INTREMAP_ENTRIES - 1);
         val |= get_field_from_reg_u32(*entry,
                                       INT_REMAP_ENTRY_INTTYPE_MASK,
Tim's:
@@ -529,10 +531,11 @@ int amd_iommu_msi_msg_update_ire(
     } while ( PCI_SLOT(bdf) == PCI_SLOT(pdev->devfn) );
 
     if ( !rc )
+    {
         for ( i = 1; i < nr; ++i )
             msi_desc[i].remap_index = msi_desc->remap_index + i;
-
-    msg->data = data;
+        msg->data = data;
+    }
 
     return rc;
 }
Andrew's:
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 9eb8d33..3aee00c 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -56,9 +56,9 @@ int arch_iommu_populate_page_table(struct domain *d)
     while ( !rc && (page = page_list_remove_head(&d->page_list)) )
     {
-        if ( has_hvm_container_domain(d) ||
-            (page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page )
-        {
+        if ( (mfn_to_gmfn(d, page_to_mfn(page)) != INVALID_MFN) &&
+             (has_hvm_container_domain(d) ||
+              ((page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page)) )
             BUG_ON(SHARED_M2P(mfn_to_gmfn(d, page_to_mfn(page))));
             rc = hd->platform_ops->map_page(
                 d, mfn_to_gmfn(d, page_to_mfn(page)), page_to_mfn(page),
And I now have this one again (which Andrew's patch should prevent):
(XEN) [2015-04-17 15:00:55.954] Xen call trace:
(XEN) [2015-04-17 15:00:55.954] [<ffff82d080155f51>] iommu_pde_from_gfn+0x38/0x430
(XEN) [2015-04-17 15:00:55.954] [<ffff82d080156456>] amd_iommu_map_page+0x10d/0x4e6
(XEN) [2015-04-17 15:00:55.954] [<ffff82d08015a93d>] arch_iommu_populate_page_table+0x179/0x4d8
(XEN) [2015-04-17 15:00:55.954] [<ffff82d08014ca61>] iommu_do_pci_domctl+0x395/0x604
(XEN) [2015-04-17 15:00:55.954] [<ffff82d08014942b>] iommu_do_domctl+0x17/0x1a
(XEN) [2015-04-17 15:00:55.954] [<ffff82d080161f70>] arch_do_domctl+0x24ac/0x2724
(XEN) [2015-04-17 15:00:55.954] [<ffff82d080104ae8>] do_domctl+0x1a98/0x1df0
(XEN) [2015-04-17 15:00:55.954] [<ffff82d0802349ab>] syscall_enter+0xeb/0x145
(XEN) [2015-04-17 15:00:55.954]
(XEN) [2015-04-17 15:00:57.023]
(XEN) [2015-04-17 15:00:57.032] ****************************************
(XEN) [2015-04-17 15:00:57.051] Panic on CPU 5:
(XEN) [2015-04-17 15:00:57.064] Xen BUG at iommu_map.c:455
(XEN) [2015-04-17 15:00:57.079] ****************************************
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel