[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it
On 30/01/18 18:46, Julien Grall wrote: > Hi, > > On 30/01/18 18:29, Andrew Cooper wrote: >> On 30/01/18 17:00, Julien Grall wrote: >>> >>> >>> On 30/01/18 16:38, Andrew Cooper wrote: >>>> On 30/01/18 16:14, Julien Grall wrote: >>>>> Hi all, >>>>> >>>>> This small series replaces all call to domain_crash_synchronous by >>>>> injecting >>>>> an exception to the guest. >>>>> >>>>> This will result to a nicer trace from the guest (no need to >>>>> manually walk >>>>> the stack) and give a chance to the guest to give a bit more >>>>> information on >>>>> what it was doing. >>>>> >>>>> Cheers, >>>>> >>>>> Julien Grall (3): >>>>> xen/arm: io: Distinguish unhandled IO from aborted one >>>>> xen/arm: Don't crash domain on bad MMIO emulation >>>>> xen/arm: Don't crash the domain on invalid HVC immediate >>>> >>>> Thanks. >>>> >>>> I don't feel qualified to review these, but some notes. >>>> >>>> Patch 1. s/avodi/avoid/ in the commit message >>>> >>>> Patches 2 and 3. You probably want to convert the printks to >>>> gdprintk()s, otherwise guests can choke up the ratelimited log. Doing >>>> so will also mean that the vcpu will be identified consistently, which >>>> it isn't currently. >>> We didn't use g*printk because it would be more confusing to print the >>> current vCPU in some cases (e.g when accessing the re-distributor of >>> another vCPU) or does not matter (e.g for ITS). >> >> In the former case, you'd want to print both current, and the target >> vcpu. The latter still matters what current is if something goes wrong. >> >> We have plenty of similar cases in x86, but at the point you are >> printing an diagnostic message, ignoring current is almost always the >> wrong think to do. > > I will look at it on another series. Fair enough. > >> >>> >>> The problem with the debug version is those information are actually >>> quite useful in non-debug build. We found quite a few issues thanks to >>> them. >>> >>> I think it would make more sense for Xen to provide per-guest >>> ratelimited than hiding those messages in non-debug build. >> >> Per guest is quite a lot more complicated than global, and would still >> require a global limit to prevent a concerted attack from multiple >> guests to avoid DoSing the system. >> >> Debug vs unilateral is your prerogative as a maintainer, but as you've >> said yourself, the are used for debugging purposes, which proves my >> point. > > So on x86, you always request the user to reproduce it with debug > build enable? That is very specific to what goes wrong, but no, we generally don't have release-build messages for "the guest screwed up and we hit it expected way for doing so". We do (well - are trying to, which is how I found this domain_crash_sync issue in the first place) get consistent at printing useful release-build messages for abnormal termination of the guest. In terms of making it easier to debug, XenServer always ships with a release and debug hypervisor built from the same source so customers can trivially switch into the debug version and collect logs if we think that is a useful thing to do, but this is only used in a fraction of cases in the first place. Anything more complicated is definitely going to require an experienced human to investigate, at which point they can do whatever they want. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |