Re: [Xen-devel] [Xen on ARM] Possible unhandled SGI bug.
On 29-04-13 14:27, Julien Grall wrote:
> On 04/29/2013 10:39 AM, Ian Campbell wrote:
>
>> On Sun, 2013-04-28 at 20:02 +0100, Sander Bogaert wrote:
>>> Hi,
>>>
>>> all previous information can be found in this thread:
>>> http://lists.xen.org/archives/html/xen-devel/2013-04/msg02772.html
>>>
>>>
>>> I've been trying to reproduce this behaviour for the last 2 days,
>>> crashme has been running on the Arndale board for a total of at
>>> least 20 hours. I restarted the process once in a while with
>>> the seed I saw crashing Xen ( 'crashme +2000.4 666 50 2:00:00
>>> 2' ).
>>>
>>> The version of crashme is 2.4, the one from the Debian Wheezy
>>> repository. The last seed logged ( needs a SD card write so I
>>> don't know when the last sync was before the crash ) was 43166
>>>
>>> I have not been able to reproduce the crash. However I'm quite
>>> sure I wasn't imagining things, I really did see Xen crash with
>>> the "SGI 2 Unhandled" error when I was running crashme from
>>> dom0 userspace.
>>
>> It could be that running crashme was just incidental, and the
>> crash just happened independently. There really ought to be no
>> way for a guest to directly generate a host level SGI and
>> certainly no way for it to generate one with a number of its
>> choosing.
>>
>>> This seems like a big deal and not being able to reproduce it
>>> is kind of frustrating. So I was wondering if there were any
>>> ideas on how this could have happened? When it did happen I
>>> just rebooted the board so it was in a 'clean' state.
>>>
>>> Maybe some speculations on a cause could help me reproduce it?
>>> A small explanation on when exactly it should issue sgi's? I
>>> would really really like to get to the bottom of this :-)
>>
>> The xen.git hypervisor uses two SGIs, GIC_SGI_EVENT_CHECK (==0)
>> and GIC_SGI_DUMP_STATE (==1). Both are issued only via calls to
>> one of send_SGI_{mask,self,allbutself} (or their various
>> wrappers). In practice this means smp_send_event_check_mask() or
>> smp_send_state_dump(). You can verify this by looking at
>> callchains that lead to one of the small number of writes to
>> GICD[GICD_SGIR].
>>
>> Julien added a new SGI in his Arndale tree to call a function on
>> another CPU (not sure what he called it without looking it up,
>> it's #2 though), this would be exercised via smp_call_function()
>> and friends.
>>
>> About my only theory about how you can have seen a spurious host
>> level SGI==2 is a partial rebuild error -- i.e. make b0rked the
>> build and you got the new version of smp_call_function et al but
>> not the new version of do_sgi(). Unless of course Julien's tree
>> temporarily had code with that behaviour (i.e. added the smp_call
>> stuff before the handler)?
>
> All this functionality is implemented in a single commit and I
> don't see this commit on your tree (commit
> 5ce4118f5768c6137d58888d57972bdfdf4c9aba).
>
> GIC_SGI_CALL_FUNCTION is called by on_selected_cpus which is used
> for:
> - halt a physical cpu
> - gdb
> - read clocks keyhandler
>

I understand I'm using an older version. The reason I'm still using it
is because I hope to reproduce this. I really don't think I 'b0rked' my
build, it's a clean pull & build. So if SGI 2 was sent it wasn't because
of this functionality.

I will rerun the test from time to time; maybe it pops up again.

Sander
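For readers following the thread, the mismatch Ian hypothesises (a sender that knows about SGI 2 paired with a handler that does not) can be sketched as a toy model. This is not Xen's code: the SGI names and numbers are the ones mentioned in the thread, but do_sgi_old_tree() and do_sgi_new_tree() are hypothetical stand-ins for do_sgi() before and after the call-function commit, and the printed message only mirrors the "SGI 2 Unhandled" text quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_GIC_SGI 16u  /* the GIC reserves interrupt IDs 0-15 for SGIs */

/* SGI numbers as given in the thread. */
enum gic_sgi {
    GIC_SGI_EVENT_CHECK   = 0,
    GIC_SGI_DUMP_STATE    = 1,
    GIC_SGI_CALL_FUNCTION = 2,  /* only present in the newer tree */
};

/* Hypothetical model of a handler that predates the call-function SGI.
 * Returns true if the SGI was recognised, false otherwise. */
bool do_sgi_old_tree(unsigned int sgi)
{
    assert(sgi < NR_GIC_SGI);
    switch (sgi) {
    case GIC_SGI_EVENT_CHECK:
    case GIC_SGI_DUMP_STATE:
        return true;
    default:
        /* Toy substitute for whatever the real handler does here. */
        printf("SGI %u Unhandled\n", sgi);
        return false;
    }
}

/* Hypothetical model of a handler that also knows the call-function SGI. */
bool do_sgi_new_tree(unsigned int sgi)
{
    assert(sgi < NR_GIC_SGI);
    switch (sgi) {
    case GIC_SGI_EVENT_CHECK:
    case GIC_SGI_DUMP_STATE:
    case GIC_SGI_CALL_FUNCTION:
        return true;
    default:
        printf("SGI %u Unhandled\n", sgi);
        return false;
    }
}
```

In this model, SGI 2 only reaches the "Unhandled" branch when something raises it against the older handler. That is why the discussion above centres on who could have issued SGI 2 in a tree that never sends it.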