
Re: [Xen-devel] Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97



Monday, February 23, 2015, 11:06:25 AM, you wrote:

>>>> On 23.02.15 at 10:27, <linux@xxxxxxxxxxxxxx> wrote:
>> While shutting down all guests ahead of a host reboot, I encountered the
>> splat below.
>> This was running on Xen with:
>> xen_changeset: Fri Feb 20 16:21:10 2015 +0100 git:24b2b8d-dirty

> "-dirty" meaning what?
A patch for re-enabling the HPET, which doesn't get enabled due to a BIOS glitch
but actually works fine (and has for over a year or so).
(If it's not enabled, cpuidle breaks badly.)

diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
index c1b76fb..43435bc 100644
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -608,7 +608,7 @@ int __init amd_setup_hpet_msi(struct msi_desc *msi_desc)
     {
         AMD_IOMMU_DEBUG("Failed to setup HPET MSI remapping."
                         " Wrong HPET.\n");
-        return -ENODEV;
+       /* return -ENODEV; */
     }

     lock = get_intremap_lock(hpet_sbdf.seg, hpet_sbdf.bdf);



And the other one is Konrad's temp fix for the dpci softirq problem:

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index ae050df..ed3cfa1 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -804,7 +804,19 @@ static void dpci_softirq(void)
         d = pirq_dpci->dom;
         smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
         if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
-            BUG();
+        {
+           unsigned long flags;
+
+            /* Put back on the list and retry. */
+            local_irq_save(flags);
+           list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
+            local_irq_restore(flags);
+
+            raise_softirq(HVM_DPCI_SOFTIRQ);
+            continue;
+       }
+
+
         /*
          * The one who clears STATE_SCHED MUST refcount the domain.
          */
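
The idea behind that temporary fix, claiming an entry with an atomic test-and-set and requeueing it instead of crashing when the bit is already set, can be sketched outside Xen with C11 atomics. This is only a minimal model under stated assumptions: `DEMO_STATE_RUN`, `demo_test_and_set_bit()` and `demo_try_claim()` are hypothetical names, not Xen functions, and the list handling and softirq raising from the patch are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit index standing in for STATE_RUN. */
#define DEMO_STATE_RUN 1u

/* Atomically set bit 'nr' in '*word' and return whether it was already set. */
static bool demo_test_and_set_bit(unsigned int nr, atomic_uint *word)
{
    unsigned int mask;

    assert(nr < 32); /* keep the shift well defined in this sketch */
    mask = 1u << nr;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/*
 * Returns true if the caller now owns the entry and may process it.
 * Returns false if someone else is still running it; in the patch above
 * that is the case where the entry is put back on the list and the
 * softirq is raised again, rather than hitting BUG().
 */
static bool demo_try_claim(atomic_uint *state)
{
    return !demo_test_and_set_bit(DEMO_STATE_RUN, state);
}
```

With a zero-initialised `atomic_uint`, the first `demo_try_claim()` call succeeds and any later call fails until the bit is cleared again, which is the situation the retry path is meant to survive.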


>> (XEN) [2015-02-23 09:16:26.292] Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97

> Since with debug=y the callstack entries should be reliable, I can't
> see how this matches up with ...

>> (XEN) [2015-02-23 09:16:26.292] Xen call trace:
>> (XEN) [2015-02-23 09:16:26.292]    [<ffff82d08012c018>] cpu_raise_softirq+0xd7/0xeb

> ... this, since

> void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
> {
>     unsigned int this_cpu = smp_processor_id();

>     if ( test_and_set_bit(nr, &softirq_pending(cpu))
>          || (cpu == this_cpu)
>          || arch_skip_send_event_check(cpu) )
>         return;

>     if ( !per_cpu(batching, this_cpu) || in_irq() )
>         smp_send_event_check_cpu(cpu);
>     else
>         set_bit(nr, &per_cpu(batch_mask, this_cpu));
> }

> doesn't indicate any use of cpumask functions. If, however,
> arch_skip_send_event_check()'s call to cpumask_test_cpu()
> didn't get inlined, that might be the cause. Albeit that would mean
> smp_processor_id() returned an out-of-range value... In any
> event we'll need to know what exactly the above code location refers
> to inside the entire function.
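
To make the suspected path concrete, here is a minimal, self-contained sketch of the kind of range check that sits in front of a cpumask bit test. It is a simplified model and not the code from xen/include/xen/cpumask.h: `demo_nr_cpu_ids`, `demo_cpu_in_range()` and `demo_cpumask_test_cpu()` are hypothetical names, the mask is a single `unsigned long`, and the sketch returns -1 where a debug=y build would hit the 'cpu < nr_cpu_ids' assertion.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption for illustration only: 8 CPU ids are valid on this host. */
static unsigned int demo_nr_cpu_ids = 8;

/* The condition the failed assertion expresses: cpu < nr_cpu_ids. */
static bool demo_cpu_in_range(unsigned int cpu)
{
    return cpu < demo_nr_cpu_ids;
}

/*
 * Model of a cpumask bit test: range-check the index first, then test
 * the bit. Returns 1 or 0 for a valid index, and -1 for an index that
 * would have tripped the assertion in a debug build.
 */
static int demo_cpumask_test_cpu(unsigned int cpu, unsigned long mask)
{
    /* The sketch keeps the whole mask in one word, so ids must fit in it. */
    assert(demo_nr_cpu_ids <= sizeof(unsigned long) * CHAR_BIT);

    if ( !demo_cpu_in_range(cpu) )
        return -1;

    return (int)((mask >> cpu) & 1UL);
}
```

In this model any caller passing an index of 8 or more, for example a stale or uninitialised CPU number, ends up in the -1 branch, which is the equivalent of the splat in the log.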

Any instructions on how to figure out which source line cpu_raise_softirq+0xd7 corresponds to?

--
Sander
> Jan




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel