Re: [Xen-devel] [Xen on ARM] Possible unhandled SGI bug.
On 04/29/2013 10:39 AM, Ian Campbell wrote:
> On Sun, 2013-04-28 at 20:02 +0100, Sander Bogaert wrote:
>> Hi,
>>
>> all previous information can be found in this thread:
>> http://lists.xen.org/archives/html/xen-devel/2013-04/msg02772.html
>>
>> I've been trying to reproduce this behaviour for the last 2 days,
>> crashme has been running on the Arndale board for a total of at least
>> 20 hours. I restarted the process once in a while with the seed I saw
>> crashing Xen ( 'crashme +2000.4 666 50 2:00:00 2' ).
>>
>> The version of crashme is 2.4, the one from the Debian Wheezy
>> repository. The last seed logged ( needs a SD card write so I don't
>> know when the last sync was before the crash ) was 43166
>>
>> I have not been able to reproduce the crash. However I'm quite sure I
>> wasn't imagining things, I really did see Xen crash with the "SGI 2
>> Unhandled" error when I was running crashme from dom0 userspace.
>
> It could be that running crashme was just incidental, and the crash just
> happened independently. There really ought to be no way for a guest to
> directly generate a host level SGI and certainly no way for it to
> generate one with a number of its choosing.
>
>> This seems like a big deal and not being able to reproduce it is kind
>> of frustrating. So I was wondering if there were any ideas on how this
>> could have happened? When it did happen I just rebooted the board so
>> it was in a 'clean' state.
>>
>> Maybe some speculations on a cause could help me reproduce it? A small
>> explanation on when exactly it should issue sgi's? I would really
>> really like to get to the bottom of this :-)
>
> The xen.git hypervisor uses two SGIs, GIC_SGI_EVENT_CHECK (==0) and
> GIC_SGI_DUMP_STATE (==1). Both are issued only via calls to one of
> send_SGI_{mask,self,allbutself} (or their various wrappers). In practice
> this means smp_send_event_check_mask() or smp_send_state_dump(). You can
> verify this by looking at the call chains leading to one of the small
> number of writes to GICD[GICD_SGIR].
>
> Julien added a new SGI in his Arndale tree to call a function on another
> CPU (not sure what he called it without looking it up, it's #2 though),
> this would be exercised via smp_call_function() and friends.
>
> About my only theory about how you can have seen a spurious host level
> SGI==2 is a partial rebuild error -- i.e. make b0rked the build and you
> got the new version of smp_call_function et al but not the new version
> of do_sgi(). Unless of course Julien's tree temporarily had code with
> that behaviour (i.e. added the smp_call stuff before the handler)?

All this functionality is implemented in a single commit, and I don't
see this commit on your tree (commit
5ce4118f5768c6137d58888d57972bdfdf4c9aba).

GIC_SGI_CALL_FUNCTION is raised by on_selected_cpus, which is used for:
  - halt a physical cpu
  - gdb
  - read clocks keyhandler

-- 
Julien Grall
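
To make the partial-rebuild theory above concrete, here is a minimal standalone sketch in C. It is not Xen's actual do_sgi(); it only models the dispatch on the SGI number. The GIC_SGI_* names and values are the ones given in this thread, while model_do_sgi() and its have_call_function_handler flag are hypothetical, standing in for a do_sgi() built with or without the new handler.

```c
#include <assert.h>
#include <stdio.h>

/* SGI numbers as given in the thread. */
enum gic_sgi {
    GIC_SGI_EVENT_CHECK   = 0,
    GIC_SGI_DUMP_STATE    = 1,
    GIC_SGI_CALL_FUNCTION = 2,  /* the SGI added in Julien's Arndale tree */
};

/*
 * Hypothetical model of an SGI dispatcher (not the real Xen function).
 * have_call_function_handler == 0 models a stale do_sgi() that predates
 * the commit adding GIC_SGI_CALL_FUNCTION.
 * Returns 1 if the SGI is handled, 0 if it falls into the unhandled path.
 */
int model_do_sgi(unsigned int sgi, int have_call_function_handler)
{
    assert(sgi < 16);  /* the GIC reserves interrupt IDs 0-15 for SGIs */

    switch (sgi) {
    case GIC_SGI_EVENT_CHECK:
    case GIC_SGI_DUMP_STATE:
        return 1;
    case GIC_SGI_CALL_FUNCTION:
        if (have_call_function_handler)
            return 1;
        break;
    default:
        break;
    }

    /* Mirrors the wording of the error reported in the thread. */
    printf("SGI %u Unhandled\n", sgi);
    return 0;
}
```

With the flag set to 0, SGI 2 takes the unhandled path, which is the mismatch a sender built from the new code and a receiver built from the old code would produce. With the flag set to 1, the same SGI is handled.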