WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] Need help with fixing the Xen waitqueue feature

To: Keir Fraser <keir.xen@xxxxxxxxx>
Subject: Re: [Xen-devel] Need help with fixing the Xen waitqueue feature
From: Olaf Hering <olaf@xxxxxxxxx>
Date: Tue, 8 Nov 2011 23:20:11 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to: <CADF5835.245E1%keir.xen@xxxxxxxxx>
References: <20111108212024.GA5276@xxxxxxxxx> <CADF5835.245E1%keir.xen@xxxxxxxxx>
On Tue, Nov 08, Keir Fraser wrote:

> On 08/11/2011 21:20, "Olaf Hering" <olaf@xxxxxxxxx> wrote:
> 
> > Another thing is that sometimes the host suddenly reboots without any
> > message. I think the reason for this is that a vcpu whose stack was put
> > aside and that was later resumed may find itself on another physical
> > cpu. And if that happens, wouldn't that invalidate some of the local
> > variables back in the callchain? If some of them point to the old
> > physical cpu, how could this be fixed? Perhaps a few "volatiles" are
> > needed in some places.
> 
> From how many call sites can we end up on a wait queue? I know we were going
> to end up with a small and explicit number (e.g., in __hvm_copy()) but does
> this patch make it a more generally-used mechanism? There will unavoidably
> be many constraints on callers who want to be able to yield the cpu. We can
> add Linux-style get_cpu/put_cpu abstractions to catch some of them. Actually
> I don't think it's *that* common that hypercall contexts cache things like
> per-cpu pointers. But every caller will need auditing, I expect.
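
[Editorial illustration, not part of the original message.] The hazard
discussed above, and the Linux-style get_cpu/put_cpu idea, can be sketched as
a small stand-alone C simulation. Everything in it is hypothetical and is not
Xen code: `percpu_state`, `current_cpu`, `preempt_count`, `get_cpu_state()`,
`put_cpu_state()` and `simulated_wait()` are invented names, and "migration"
is modelled simply by changing an index. It only shows why a per-cpu pointer
cached before a wait can be stale afterwards, and how a pin/unpin pair lets
the wait path detect such callers.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 8

/* Hypothetical per-cpu state: stands in for any data indexed by physical cpu. */
struct percpu_state {
    unsigned int cpu_id;
    unsigned long counter;
};

static struct percpu_state percpu[NR_CPUS];
static unsigned int current_cpu;  /* simulated "physical cpu we run on" */
static int preempt_count;         /* > 0: a per-cpu reference is held */

static void percpu_init(void)
{
    for (unsigned int i = 0; i < NR_CPUS; i++) {
        percpu[i].cpu_id = i;
        percpu[i].counter = 0;
    }
    current_cpu = 0;
    preempt_count = 0;
}

/* Pin: hand out the per-cpu pointer and record that it is in use. */
static struct percpu_state *get_cpu_state(void)
{
    preempt_count++;
    return &percpu[current_cpu];
}

/* Unpin: the caller promises not to use the pointer any more. */
static void put_cpu_state(void)
{
    assert(preempt_count > 0);
    preempt_count--;
}

/*
 * Simulated wait: the caller may resume on another cpu.
 * Returns -1 (and does not migrate) if a per-cpu reference is still held.
 */
static int simulated_wait(unsigned int resume_cpu)
{
    if (preempt_count > 0)
        return -1;
    current_cpu = resume_cpu % NR_CPUS;
    return 0;
}

/* Hazard: a pointer cached without pinning refers to the old cpu afterwards. */
static int stale_pointer_after_wait(void)
{
    struct percpu_state *cached = &percpu[current_cpu];
    simulated_wait(current_cpu + 1);
    return cached->cpu_id != current_cpu;  /* 1 means the pointer is stale */
}

/* Guarded: the wait is refused while the reference is pinned. */
static int wait_refused_while_pinned(void)
{
    struct percpu_state *s = get_cpu_state();
    int rc = simulated_wait(current_cpu + 1);
    s->counter++;  /* still valid: no migration happened */
    put_cpu_state();
    return rc;
}
```

In a real hypervisor the check in `simulated_wait()` would more likely be a
debug assertion than an error return; the error return is used here only to
keep the sketch easy to exercise. Callers still need auditing, since the
counter only catches code that actually uses the pin/unpin pair.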

I haven't started to audit the callers. In my testing,
mem_event_put_request() is called from p2m_mem_paging_drop_page() and
p2m_mem_paging_populate(). The latter is called from more places.

My plan is to put the sleep into ept_get_entry(), but I'm not there yet.
First I want to test waitqueues in a rather simple code path like
mem_event_put_request().

> A sudden reboot is very extreme. No message even on a serial line? That most
> commonly indicates bad page tables. Most other bugs you'd at least get a
> double fault message.

There is no output on serial. I boot with this command line:
  vga=mode-normal console=com1 com1=57600 loglvl=all guest_loglvl=all
  sync_console conring_size=123456 maxcpus=8 dom0_vcpus_pin
  dom0_max_vcpus=2
My base changeset is 24003; the test host is a Xeon X5670 @ 2.93GHz.

Olaf
