
Re: [Xen-devel] [xen-unstable test] 58821: tolerable FAIL



On 22/06/15 16:36, Jan Beulich wrote:
>>>> On 22.06.15 at 17:17, <ian.campbell@xxxxxxxxxx> wrote:
>> On Mon, 2015-06-22 at 14:09 +0000, osstest service user wrote:
>>> flight 58821 xen-unstable real [real]
>>> http://logs.test-lab.xenproject.org/osstest/logs/58821/ 
>>>
>> [...]
>>>  test-amd64-amd64-libvirt     11 guest-start                  fail   like 58789
>>
>> http://logs.test-lab.xenproject.org/osstest/logs/58821/test-amd64-amd64-libvirt/info.html
>>
>> While investigating why libvirt hasn't been succeeding very well on
>> merlot* I came across some things in the serial log which initially
>> struck me as odd, but which I suspect are nothing (or at least not
>> terribly relevant). If someone could confirm, that would be great.
>>
>> Firstly is:
>>
>> Jun 22 12:41:09.633294 (XEN) microcode: CPU2 updated from revision 0x6000822 to 0x6000832
>> Jun 22 12:41:09.665099 (XEN) microcode: CPU4 updated from revision 0x6000822 to 0x6000832
>> Jun 22 12:41:09.729089 (XEN) microcode: CPU6 updated from revision 0x6000822 to 0x6000832
>> [...]
>>
>> i.e. only even numbered cpus are updated. (0 was done earlier in boot).
>> I suspect that the answer here is "hyperthreading", and the cpuinfo
>> shows all cpus have in fact been updated.
> Yes (albeit hyperthreading is an Intel term, but iirc the same applies
> to the two cores per compute unit).

Indeed.  The "microcode: patch is already at required level or
greater.\n" message is helpfully unconditionally compiled out, so a CPU
which is already up to date prints nothing.
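
For anyone wanting to check this against a serial log, here is a minimal
sketch (not part of Xen or osstest) which lists the CPUs that logged an
update and those that stayed silent.  The regular expression is assumed
from the three lines quoted above, and the helper names and SAMPLE text
are made up for illustration.

```python
import re

# Pattern assumed from the "(XEN) microcode: CPUn updated from revision
# X to Y" lines quoted in this thread; other log formats are not handled.
UPDATE_RE = re.compile(
    r"microcode: CPU(\d+) updated from revision "
    r"(0x[0-9a-fA-F]+)\s+to (0x[0-9a-fA-F]+)"
)


def updated_cpus(log_text):
    """Map CPU number -> (old_revision, new_revision) for each logged update."""
    return {
        int(cpu): (int(old, 16), int(new, 16))
        for cpu, old, new in UPDATE_RE.findall(log_text)
    }


def silent_cpus(log_text, total_cpus):
    """CPUs with no update line in log_text (hypothetical helper)."""
    seen = updated_cpus(log_text)
    return [cpu for cpu in range(total_cpus) if cpu not in seen]


# Illustrative excerpt only, with the wrapped lines rejoined.
SAMPLE = """\
(XEN) microcode: CPU2 updated from revision 0x6000822 to 0x6000832
(XEN) microcode: CPU4 updated from revision 0x6000822 to 0x6000832
(XEN) microcode: CPU6 updated from revision 0x6000822 to 0x6000832
"""
```

With this excerpt, silent_cpus(SAMPLE, 8) reports CPU0 as well as the odd
CPUs, because the CPU0 update happened earlier in boot and is not in the
excerpt.  A silent CPU is not necessarily an unpatched one; the revision
reported per CPU is what settles it.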

>
>> The second thing is:
>> Jun 22 12:41:10.601103 (XEN) Brought up 32 CPUs
>> Jun 22 12:41:10.625270 (XEN) Testing NMI watchdog on all CPUs: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 stuck
>>
>> i.e. at least one CPU has issues with NMI watchdog (looking at other
>> runs it seems to vary between 29-31). Is this just that the NMI watchdog
>> doesn't deal well with so many pCPUs? Or is it a real issue?
> Very few CPUs properly responding is certainly quite odd; one
> would expect all or none of them to work. Perhaps our AMD
> maintainers (now Cc-ed) could take a look...

Recent investigation in XenServer turned up several things wrong with
the NMI testing in Xen at the moment.  Time isn't accounted properly for
cores under BIOS/hardware power control, and Xen doesn't wait for the
requisite time even if the core were running at its expected frequency.
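
To illustrate how a too-short wait can produce a false "stuck", here is
a toy model.  It is not Xen's code: the function names, the tick rate,
the wait length and the threshold are all invented for the example, and
it assumes the NMI source ticks in proportion to the core's current
frequency.

```python
def expected_nmis(wait_ms, nmi_hz, freq_percent=100):
    """NMIs a core would see during the wait window.

    Assumption (hypothetical model): the NMI source fires nmi_hz times
    per second at nominal frequency and slows down linearly when the
    core runs at freq_percent of nominal.  Integer maths keeps the
    result exact.
    """
    return wait_ms * nmi_hz * freq_percent // (1000 * 100)


def stuck_cpus(before, after, min_ticks):
    """CPUs whose NMI count advanced by fewer than min_ticks.

    before/after map CPU number -> NMI count sampled at the start and
    end of the wait window.
    """
    return [cpu for cpu in sorted(before)
            if after[cpu] - before[cpu] < min_ticks]


# Made-up numbers: a 100 ms wait and a nominal 100 NMIs per second.
full_speed = expected_nmis(100, 100)        # core at nominal frequency
throttled = expected_nmis(100, 100, 10)     # core held at 10% by firmware

before = {0: 0, 1: 0}
after = {0: full_speed, 1: throttled}
print(stuck_cpus(before, after, min_ticks=5))
```

In this model the throttled core is flagged even though its watchdog is
working, purely because the wait window was sized for the nominal
frequency.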

I should see about getting those patches posted, but for now, ignore the
"stuck" line.  It is more than likely wrong.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel