
Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL



On Mon, Jul 25, 2016 at 12:34:53PM +0100, Julien Grall wrote:
> 
> 
> On 25/07/16 12:11, Wei Liu wrote:
> >Thanks for investigating.
> >
> >There are only two arm related changes in the range being tested:
> >
> >* a43cc8f - (origin/smoke) arm/traps: fix bug in dump_guest_s1_walk handling of level 2 page tables (5 days ago) <Jonathan Daugherty>
> >* 60e06f2 - arm/traps: fix bug in dump_guest_s1_walk L1 page table offset computation (5 days ago) <Jonathan Daugherty>
> >
> >They don't look very suspicious.
> 
> The modified function is not called in the hypervisor at all. It's only here
> for manual debugging.
> 
> That said, this may change the offsets of some functions (assuming we have a
> hidden bug).
> 
> >If you need help navigating osstest test report, please let me know.
> 
> I have noticed that there are 2 kernel BUGs in the logs (with one host reboot
> in the middle). Can you detail what exactly the test does?

What I normally do is to look at the summary page of the failed test to
identify the failed step and the time.

In this case:

http://logs.test-lab.xenproject.org/osstest/logs/97737/test-armhf-armhf-xl/info.html

The time stamp says the failed step started at 2016-07-21 19:30:10 Z. I
then look at the output of the failed step's log for the time stamp at
which the test failed. Then I look for output between these two time
stamps in the various logs.
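
(As an illustration of the "look between two time stamps" step only, a
throwaway filter along the following lines works on the serial logs. It is
not an osstest tool, and the window values below are just example numbers.)

#define _XOPEN_SOURCE 700   /* for strptime() */
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Parse the leading "Jul 21 17:08:59" prefix of a serial log line. */
static int parse_stamp(const char *line, struct tm *out)
{
    memset(out, 0, sizeof(*out));
    return strptime(line, "%b %d %H:%M:%S", out) != NULL;
}

/* Comparable key; it only needs to be monotonic within one log file. */
static long key(const struct tm *t)
{
    return (((long)t->tm_mon * 32 + t->tm_mday) * 24 + t->tm_hour) * 3600
           + t->tm_min * 60 + t->tm_sec;
}

int main(void)
{
    struct tm lo, hi, cur;
    char buf[4096];

    /* Example window; take the real values from the step's start/end times. */
    parse_stamp("Jul 21 17:08:00", &lo);
    parse_stamp("Jul 21 17:10:00", &hi);

    while (fgets(buf, sizeof(buf), stdin))
        if (parse_stamp(buf, &cur) &&
            key(&cur) >= key(&lo) && key(&cur) <= key(&hi))
            fputs(buf, stdout);

    return 0;
}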

I now realise the log I pasted in was not from the failed test. I wanted
to paste in the second kernel oops, which should be the culprit for the
test failure. The two oopses were the same, though.

To identify which test step was running when the first oops happened,
the same technique applies.

It seems that the oops happened during ts-debian-install, according to the
time stamps.
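
(For reference, drivers/xen/grant-table.c:923 appears to be the BUG() taken
when the GNTTABOP_copy hypercall itself returns an error, which matches the
-EFAULT in r0 below. Paraphrased from the mainline driver and trimmed for
illustration, so not the exact 3.16.7-ckt12 source:)

void gnttab_batch_copy(struct gnttab_copy *batch, unsigned count)
{
        struct gnttab_copy *op;

        /* r0 = fffffff2 (-EFAULT) in the oops means this call failed. */
        if (HYPERVISOR_grant_table_op(GNTTABOP_copy, batch, count))
                BUG();

        /* Individual ops reporting GNTST_eagain are simply retried. */
        for (op = batch; op < batch + count; op++)
                if (op->status == GNTST_eagain)
                        gnttab_retry_eagain_gop(GNTTABOP_copy, op,
                                                &op->status, __func__);
}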

Wei.

> 
> It looks to me that you are trying to power cycle a guest multiple times.
> 
> Cheers,
> 
> >Wei.
> >
> >
> >On Mon, Jul 25, 2016 at 12:05:08PM +0100, Julien Grall wrote:
> >>Hi Wei,
> >>
> >>On 25/07/16 09:53, Wei Liu wrote:
> >>>On Fri, Jul 22, 2016 at 03:27:30AM +0000, osstest service owner wrote:
> >>>>flight 97737 xen-unstable real [real]
> >>>>http://logs.test-lab.xenproject.org/osstest/logs/97737/
> >>>>
> >>>>Regressions :-(
> >>>>
> >>>>Tests which did not succeed and are blocking,
> >>>>including tests which could not be run:
> >>>>test-armhf-armhf-xl          15 guest-start/debian.repeat fail REGR. vs. 97664
> >>>
> >>>From
> >>>
> >>>\
> >>>
> >>>
> >>>Jul 21 17:08:59.405183 [ 4479.814529] ------------[ cut here ]------------
> >>>
> >>>Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at drivers/xen/grant-table.c:923!
> >>>
> >>>Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP ARM
> >>>
> >>>Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge stp ipv6 llc brcmfmac brcmutil cfg80211
> >>>
> >>>Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: vif5.0-q0-guest Not tainted 3.16.7-ckt12+ #1
> >>>
> >>>Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: d85bc000
> >>>
> >>>Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4
> >>>
> >>>Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4
> >>>
> >>>Jul 21 17:09:17.002718 [ 4479.814888] pc : [<c04bb190>]    lr : [<c04bb0dc>]    psr: a0070013
> >>>
> >>>Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0  ip : deadbeef  fp : c0c8e140
> >>>
> >>>Jul 21 17:09:17.014341 [ 4479.814935] r10: 00000000  r9 : e1bec000  r8 : 00000000
> >>>
> >>>Jul 21 17:09:17.019595 [ 4479.814960] r7 : 00000002  r6 : 00000002  r5 : d85bdf20  r4 : e1bf4d30
> >>>
> >>>Jul 21 17:09:17.026095 [ 4479.814990] r3 : 00000001  r2 : deadbeef  r1 : deadbeef  r0 : fffffff2
> >>>
> >>>Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
> >>>
> >>>Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d  Table: 78d8406a  DAC: 00000015
> >>>
> >>>Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, stack limit = 0xd85bc248)
> >>>
> >>>Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000)
> >>>
> >>>Jul 21 17:09:17.057218 [ 4479.815145] dea0: 00000001 d8b11388 d85bdf20 d85bdf04 00000002 c05eb054 00000388 00000000
> >>>
> >>>Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04 00000000 00000000 c0b7ea80 db0995c0 c05e86e4 e1bf4000 0000003c
> >>>
> >>>Jul 21 17:09:17.073753 [ 4479.815221] dee0: 00000000 00000000 00000000 c0b8849c e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4
> >>>
> >>>Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80 00000000 d85bdf08 d85bdf08 d8c5cb40 d8c5cb40 00000001 00000000
> >>>
> >>>Jul 21 17:09:17.090217 [ 4479.815298] df20: 00000002 00000000 00000001 00000000 e1bf4d30 e1c1f530 000004c6 0000023c
> >>>
> >>>Jul 21 17:09:17.098466 [ 4479.815337] df40: 00000000 00000000 d84aab80 e1bec000 c05ea990 00000000 00000000 00000000
> >>>
> >>>Jul 21 17:09:17.106720 [ 4479.815375] df60: 00000000 c0266238 00000000 00000000 000000f8 e1bec000 00000000 00000000
> >>>
> >>>Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80 00000000 00000000 d85bdf90 d85bdf90 d85bdfac d84aab80
> >>>
> >>>Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168 00000000 00000000 c020f138 00000000 00000000 00000000 00000000
> >>>
> >>>Jul 21 17:09:17.131345 [ 4479.815489] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>
> >>>Jul 21 17:09:17.139596 [ 4479.815527] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> >>>
> >>>Jul 21 17:09:17.147841 [ 4479.815583] [<c04bb190>] (gnttab_batch_copy) from [<c05eb054>] (xenvif_kthread_guest_rx+0x6c4/0xb58)
> >>
> >>From my understanding, the hypercall can only return a non-zero value if
> >>the copy_*_guest helpers fail.
> >>
> >>Those helpers will only fail when it is not possible to retrieve the page
> >>associated with a virtual address. The value in r0 (-EFAULT) seems to
> >>confirm that. So this looks very suspicious.
> >>
> >>Looking at the other parameters and the assembly code (see [1]):
> >>    count = 2 (saved in r6)
> >>    batch = 0xe1bf4d30 (saved in r4)
> >>
> >>They look valid to me. Also, there has been no major change around that code
> >>recently.
> >>
> >>I don't have many ideas about what is going on. And unfortunately Xen ARM
> >>does not print much information when the translation fails.
> >>
> >>I have CCed a few more people to see if they have a clue.
> >>
> >>>
> >>>Jul 21 17:09:17.156969 [ 4479.815636] [<c05eb054>] (xenvif_kthread_guest_rx) from [<c0266238>] (kthread+0xd0/0xe8)
> >>>
> >>>Jul 21 17:09:17.165217 [ 4479.815681] [<c0266238>] (kthread) from [<c020f138>] (ret_from_fork+0x14/0x3c)
> >>>
> >>>Jul 21 17:09:17.172467 [ 4479.815721] Code: e1c432b4 eaffffe0 e7f001f2 e8bd80f8 (e7f001f2)
> >>>
> >>>Jul 21 17:09:17.178595 [ 4479.815766] ---[ end trace 6ba7d172d52e24e2 ]---
> >>>
> >>
> >>Regards,
> >>
> >>[1] http://logs.test-lab.xenproject.org/osstest/logs/97737/build-armhf-pvops/info.html
> >>
> >>c04bb0c0 <gnttab_batch_copy>:
> >>c04bb0c0:       e92d40f8        push    {r3, r4, r5, r6, r7, lr}
> >>c04bb0c4:       e1a02001        mov     r2, r1
> >>c04bb0c8:       e1a04000        mov     r4, r0
> >>c04bb0cc:       e1a06001        mov     r6, r1
> >>c04bb0d0:       e1a01000        mov     r1, r0
> >>c04bb0d4:       e3a00005        mov     r0, #5
> >>c04bb0d8:       ebf54e39        bl      c020e9c4 <HYPERVISOR_grant_table_op>
> >>c04bb0dc:       e3500000        cmp     r0, #0
> >>c04bb0e0:       1a00002a        bne     c04bb190 <gnttab_batch_copy+0xd0>
> >>
> >>--
> >>Julien Grall
> >
> 
> -- 
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel