xen-devel

To: Shriram Rajagopalan <rshriram@xxxxxxxxx>
Subject: Re: [Xen-devel] remus failure -xen 4.0.1: xc_domain_restore cannot pin page tables
From: Bruce Edge <bruce.edge@xxxxxxxxx>
Date: Wed, 17 Nov 2010 09:13:02 -0800
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Wed, 17 Nov 2010 09:13:34 -0800
On Mon, Sep 6, 2010 at 10:56 PM, Shriram Rajagopalan <rshriram@xxxxxxxxx> wrote:
>
> Hardware: Dell Poweredge R510 (32G ram, 8 CPU- Xeon)
>
> 64bit - xen 4.0.1 stable
>
> 64bit - 2.6.32.18 dom0 (.config attached) running Ubuntu 10.04
> 32 bit - 2.6.18.8 domU (.config attached) running ubuntu 8.04
>
> domU has 3 tap2 disks on LVM snapshots.
> domU has 2G mem, 2 VCPUs
>
> With a light workload on domU (ssh + top running), destroying the domain
> works.
>
> But if I run a heavier workload, say a postgres db (just starting the db, no
> queries), remus fails to recover. Note that this is not a spurious timeout
> error.
> On destroying the vm on the primary, the backup fails to recover the vm with
> the following error in xm dmesg:
>
> (XEN) mm.c:779:d0 Bad L1 flags 98
> (XEN) mm.c:1186:d0 Failure in alloc_l1_table: entry 1
> (XEN) mm.c:2117:d0 Error while validating mfn 4101af (pfn 2cc08) for type 1000000000000000: caf=8000000000000003 taf=1000000000000001
> (XEN) mm.c:868:d0 Attempt to create linear p.t. with write perms
> (XEN) mm.c:1330:d0 Failure in alloc_l2_table: entry 113
> (XEN) mm.c:2117:d0 Error while validating mfn 40fc4c (pfn 2d1ce) for type 2000000000000000: caf=8000000000000003 taf=2000000000000001
> (XEN) mm.c:1440:d0 Failure in alloc_l3_table: entry 2
> (XEN) mm.c:2117:d0 Error while validating mfn 40fcdf (pfn 2d08d) for type 3000000000000000: caf=8000000000000003 taf=3000000000000001
> (XEN) mm.c:2733:d0 Error while pinning mfn 40fcdf
> ============
>
> Error in xend.log @ backup
> -----------------------------
> [2010-09-06 21:38:16 2392] DEBUG (XendDomainInfo:1804) Storing domain details:
> {'image/entry': '3222274048', 'console/port': '2', 'image/loader': 'generic',
> 'vm': '/vm/7be5f9bf-da53-6c10-d4e5-330940210966',
> 'control/platform-feature-multiprocessor-suspend': '1',
> 'image/hv-start-low': '4118806528', 'image/guest-os': 'linux',
> 'cpu/1/availability': 'online',
> 'image/features/writable-descriptor-tables': '1',
> 'image/virt-base': '3221225472', 'memory/target': '2048000',
> 'image/guest-version': '2.6', 'image/features/supervisor-mode-kernel': '1',
> 'image/pae-mode': 'yes', 'description': '', 'console/limit': '1048576',
> 'image/paddr-offset': '3221225472', 'image/hypercall-page': '3222278144',
> 'image/suspend-cancel': '1', 'cpu/0/availability': 'online',
> 'image/features/pae-pgdir-above-4gb': '1',
> 'image/features/writable-page-tables': '1', 'console/type': 'xenconsoled',
> 'image/features/auto-translated-physmap': '1', 'name': 'tpccExpt-remus',
> 'domid': '6', 'image/xen-version': 'xen-3.0', 'store/port': '1'}
> [2010-09-06 21:38:16 2392] DEBUG (XendCheckpoint:286) restore:shadow=0x0, _static_max=0x7d000000, _static_min=0x0,
> [2010-09-06 21:38:16 2392] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen/bin/xc_restore 4 6 1 2 0 0 0 0
> [2010-09-06 21:38:16 2392] INFO (XendCheckpoint:423) xc_domain_restore start: p2m_size = 7d000
> [2010-09-06 21:38:16 2392] INFO (XendCheckpoint:423) Reloading memory pages:   0%
> [2010-09-06 21:40:24 2392] INFO (XendCheckpoint:423) ERROR Internal error: Error when reading batch size
> [2010-09-06 21:40:24 2392] INFO (XendCheckpoint:423) ERROR Internal error: error when buffering batch, finishing
> [2010-09-06 21:40:24 2392] INFO (XendCheckpoint:423)
> [2010-09-06 21:40:24 2392] INFO (XendCheckpoint:423) ERROR Internal error: Failed to pin batch of 18 page tables (22 = Invalid argument)
> [2010-09-06 21:40:25 2392] INFO (XendCheckpoint:423) Restore exit with rc=1
>
> The number of page tables reported in the pin failure also varies between
> runs (16, 18, 20)...
> =============
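
As a quick sanity check on the restore log above, the sizes it reports can be cross-checked against each other. This is only a sketch: it assumes that p2m_size is printed in hex and counts 4096-byte pages (xen_pagesize in the quoted xm info), and that 'memory/target' is in KiB. The variable names are mine, not anything from the Xen tools.

```python
PAGE_SIZE = 4096               # xen_pagesize from the quoted xm info output

p2m_size_pages = 0x7d000       # "p2m_size = 7d000", assumed to be hex
static_max_bytes = 0x7d000000  # "_static_max=0x7d000000"
memory_target_kib = 2048000    # 'memory/target', assumed to be in KiB

# Convert the page count to bytes so it can be compared with the other fields.
p2m_bytes = p2m_size_pages * PAGE_SIZE

print("pages:", p2m_size_pages)
print("MiB:", p2m_bytes // (1024 * 1024))
print("matches _static_max:", p2m_bytes == static_max_bytes)
print("matches memory/target:", p2m_bytes // 1024 == memory_target_kib)
```

Under those assumptions the three values agree (512000 pages, 2000 MiB), so the restore at least starts with a consistent idea of the guest size; the failure comes later, at pin time.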

I'm seeing this too. Here's my config:

xen unstable - 22395:deb438d43e79 Tue Nov 16 15:41:28 2010 +0000
dom0 - xen/stable-2.6.32.x a504ac446b2ca0d308000bdf5a3b96b2afd79261 Thu Aug 12 10:51:38
domU - mainline 2.6.37-rc2
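
For anyone reading the "Bad L1 flags 98" line in the quoted xm dmesg, here is a small sketch that decodes the value. It assumes the number is a hex mask of the architectural low-order x86 page-table-entry flag bits; the table and the helper name decode_pte_flags are only illustrative and are not taken from the Xen source.

```python
# Low-order x86 page-table-entry flag bits (architectural bit positions).
PTE_FLAGS = {
    0x001: "PRESENT",
    0x002: "RW",
    0x004: "USER",
    0x008: "PWT",
    0x010: "PCD",
    0x020: "ACCESSED",
    0x040: "DIRTY",
    0x080: "PAT",     # in a 4 KiB L1 entry; the same bit means PSE at L2/L3
    0x100: "GLOBAL",
}


def decode_pte_flags(value):
    """Return the names of the flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(PTE_FLAGS.items()) if value & bit]


print(decode_pte_flags(0x98))
```

If that reading is right, 0x98 is PWT + PCD + PAT, i.e. only cache-attribute bits, which may be a hint about which L1 entries are being rejected.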

-Bruce



>
>
> xm info output (stripped)
> machine                : x86_64
> nr_cpus                : 8
> nr_nodes               : 2
> cores_per_socket       : 4
> threads_per_core       : 1
> cpu_mhz                : 2133
> hw_caps                : bfebfbff:28100800:00000000:00001b40:009ce3bd:00000000:00000001:00000000
> virt_caps              : hvm hvm_directio
> total_memory           : 32758
> free_memory            : 28985
> node_to_cpu            : node0:0,2,4,6
>                          node1:1,3,5,7
> node_to_memory         : node0:12731
>                          node1:16254
> node_to_dma32_mem      : node0:0
>                          node1:2993
> max_node_id            : 1
> xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler          : credit
> xen_pagesize           : 4096
> platform_params        : virt_start=0xffff800000000000
> xen_commandline        : dummy=dummy dom0_mem=4096M
> cc_compiler            : gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
> xend_config_format     : 4
> ------------------------------------------------------
>
> I need the 2.6.18 domU because of the suspend event channel support.
>
> --
> perception is but an offspring of its own self
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>
>
