On 23/11/13 20:09, Steven Haigh wrote: 
    > On 24/11/13 07:03, Andrew Cooper wrote:
      >> On 23/11/13 19:56, Steven Haigh wrote: 
      >>> On 24/11/13 06:38, Steven Haigh wrote: 
      >>>> On 24/11/13 06:27, Olaf Hering wrote: 
      >>>>> On Sun, Nov 24, Steven Haigh wrote: 
      >>>>> 
      >>>>>> Running Xen 4.2.3 with all the current XSA fixes.
      >>>>> 
      >>>>> How exactly did you start the guests? 
      >>>> 
      >>>> The DomUs were started with: xl create /etc/xen/<configfile>
      >>>> 
      >>>>> Does 'ps faxu' show qemu processes for the listed domain_ids?
      >>>>> What is the 'xenstore-ls -f | sort' output? 
      >>>> 
      >>>> I'll have to check this when I manage to reproduce it. So far, I have
      >>>> been unable to find a reliable way to reproduce it. I managed to get a
      >>>> system to do it every time an HVM DomU was shut down OR restarted - but
      >>>> after a reboot of the Dom0 I can't get it into that state again.
      >>>>
      >>>> As soon as I can get a system in this state again, I'll leave it to see
      >>>> what information I can extract.
      >>> 
      >>> Ha! As always, as soon as I send this, I notice it's happened on a Dom0.
      >>> 
      >>> # xl list 
      >>> Name                                        ID   Mem VCPUs      State   Time(s)
      >>> Domain-0                                     0  1579     2     r-----  2731.3
      >>> planner.vm                                   1  1013     1     -b----   189.3
      >>> (null)                                       2     0     1     --psrd   301.1
      >>> tracker.vm                                   3  1013     2     -b----   834.4
      >>> 
      >>> Attached is the output of: 
      >>> # xl debug-keys q 
      >>> # xl dmesg  > xen-dmesg.log 
      >>> # gzip xen-dmesg.log 
      >> 
      >> Ok - from dmesg. 
      >> 
      >> (XEN) General information for domain 2: 
      >> (XEN)     refcnt=1 dying=2 pause_count=2 
      >> (XEN)     nr_pages=2 xenheap_pages=0 shared_pages=0 paged_pages=0 dirty_cpus={} max_pages=262400
      >> (XEN)     handle=ef58ef1a-784d-4e59-8079-42bdee87f219 vm_assist=00000000
      >> (XEN)     paging assistance: hap refcounts translate external
      >> ... 
      >> (XEN) Memory pages belonging to domain 2: 
      >> (XEN)     DomPage 00000000000866e0: caf=00000001, taf=0000000000000000
      >> (XEN)     DomPage 00000000000866e1: caf=00000001, taf=0000000000000000
      >> (XEN)     PoD entries=0 cachesize=0 
      >> 
      >> 
      >> So there are indeed two outstanding pages causing this domain to become
      >> a zombie.  They are normal pages, with 1 outstanding ref.
      >> 
      >> Can you collect "xl debug-keys g" as well? 
      > 
      > Sure - attached. 
     
    (XEN)       -------- active --------       -------- shared -------- 
    (XEN) [ref] localdom mfn      pin          localdom gmfn     flags 
    (XEN) grant-table for remote domain:    2 (v1) 
    (XEN) [16302]        0 0x0866e1 0x00000001          0 0x0064e1 0x19 
    (XEN) [16320]        0 0x0866e0 0x00000001          0 0x0064e0 0x19 
     
    Ok - so domain 2 has two outstanding grants.  This explains why it
    is a zombie. 
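
The link between the two dumps can be checked mechanically. The following is only a sketch: it parses the DomPage lines from the 'q' output and the grant entries from the 'g' output exactly as quoted in this thread, and reports which leftover pages are also referenced by an active grant entry. The regular expressions and the name `granted_leftovers` are assumptions based solely on the lines shown here, not a general parser for the debug-key output.

```python
import re

# Lines copied from the 'xl debug-keys q' and 'xl debug-keys g' output quoted in this thread.
Q_DUMP = """\
(XEN)     DomPage 00000000000866e0: caf=00000001, taf=0000000000000000
(XEN)     DomPage 00000000000866e1: caf=00000001, taf=0000000000000000
"""
G_DUMP = """\
(XEN) [ref] localdom mfn      pin          localdom gmfn     flags
(XEN) [16302]        0 0x0866e1 0x00000001          0 0x0064e1 0x19
(XEN) [16320]        0 0x0866e0 0x00000001          0 0x0064e0 0x19
"""

# Assumed layouts, inferred only from the sample lines above.
DOMPAGE_RE = re.compile(r"DomPage ([0-9a-f]+): caf=([0-9a-f]+)")
GRANT_RE = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+0x([0-9a-f]+)\s+0x([0-9a-f]+)"
    r"\s+(\d+)\s+0x([0-9a-f]+)\s+0x([0-9a-f]+)"
)

def granted_leftovers(q_dump, g_dump):
    """Map each leftover page's mfn to the grant ref whose active mfn matches it."""
    pages = {int(m.group(1), 16) for m in DOMPAGE_RE.finditer(q_dump)}
    grants = {int(m.group(3), 16): int(m.group(1)) for m in GRANT_RE.finditer(g_dump)}
    return {mfn: grants[mfn] for mfn in sorted(pages) if mfn in grants}

for mfn, ref in granted_leftovers(Q_DUMP, G_DUMP).items():
    print(f"mfn {mfn:#x} is held by grant ref {ref}")
```

For the data above, both leftover pages of domain 2 show up as the active mfn of a grant entry.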
     
    Both these grants are GTF_writing | GTF_reading | GTF_permit_access,
    but seemingly unmapped. 
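
For reference, the 0x19 in the flags column can be decoded against the GTF_* bit definitions in Xen's public grant_table.h. This is a minimal sketch: the constants are transcribed by hand from that header (the low two bits are the entry type; 0x04, 0x08 and 0x10 are readonly, reading and writing), and the helper name `decode_gtf` is made up for illustration.

```python
# Grant entry flag values, transcribed from Xen's public grant_table.h.
GTF_TYPE_MASK = 0x3
GTF_TYPES = {
    0: "GTF_invalid",
    1: "GTF_permit_access",
    2: "GTF_accept_transfer",
    3: "GTF_transitive",
}
GTF_BITS = {
    0x04: "GTF_readonly",
    0x08: "GTF_reading",
    0x10: "GTF_writing",
}

def decode_gtf(flags):
    """Return the entry type name followed by the names of any set status bits."""
    names = [GTF_TYPES[flags & GTF_TYPE_MASK]]
    for bit, name in sorted(GTF_BITS.items()):
        if flags & bit:
            names.append(name)
    return names

print(" | ".join(decode_gtf(0x19)))
```

Only the type and the three bits listed are decoded; any other bits in the flags word are ignored by this sketch.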
     
    I will have to defer to someone who knows the grant code better.  Is
    it possible for a domain to be a zombie just because it has two
    grants it hasn't manually invalidated? 
     
    ~Andrew 
     
  