
Re: [Xen-devel] [xen-unstable test] 116178: regressions - FAIL




On 11/15/2017 04:46 PM, Julien Grall wrote:
> Hi,
> 
> On 11/15/2017 11:29 AM, osstest service owner wrote:
>> flight 116178 xen-unstable real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/116178/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>    test-armhf-armhf-libvirt-raw 15 guest-start/debian.repeat fail REGR. vs. 116161
> 
> The kernel is hitting a BUG() in gnttab_batch_copy() (see stack trace). This
> seems to be because GNTTABOP_copy is failing. Looking at the serial log, this
> seems to happen from time to time on the Arndale (not on the cubietruck) with
> different versions of the kernel.
> 
> I reported a similar error last year ([1]), and still have no clue why
> page-table translation might fail...
> 
> I am going to send a patch adding a bit more debug in the function doing the
> translation from the guest PA to the host PA. Hopefully, it might tell us a
> bit more about what's going on.

We finally had a repro on the 1st of December with the patch applied (see log
below).

From the log:

gvirt_to_maddr failed va=0xe1610e34 flags=0x1 par=0x80b

This is a stage-1 translation fault level 1, and it happens when trying to copy
data to the guest (flags=1).
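
For reference, here is a minimal sketch of how par=0x80b decodes to that
fault. It assumes the PAR is reported in the long-descriptor (LPAE) format,
with F in bit 0, the fault status code in bits [6:1], and the stage-2 fault
indication in bit 9. The function name decode_par and the partial fault table
are only for illustration; this is not Xen code.

```python
def decode_par(par: int) -> dict:
    """Decode a failed address translation result (PAR, long-descriptor format)."""
    if not par & 0x1:
        # F == 0: the translation succeeded, there is no fault to decode.
        return {"fault": False}

    fst = (par >> 1) & 0x3F  # fault status code, bits [6:1]
    # Partial table: only the fault classes that carry a lookup level.
    kinds = {0b0001: "translation", 0b0010: "access flag", 0b0011: "permission"}

    return {
        "fault": True,
        "stage": 2 if par & (1 << 9) else 1,         # bit 9: stage-2 fault
        "s2_during_s1_walk": bool(par & (1 << 8)),   # bit 8: stage-2 fault on a stage-1 walk
        "kind": kinds.get(fst >> 2, "other"),
        "level": fst & 0x3,                          # only meaningful for the kinds above
    }


print(decode_par(0x80B))
```

For 0x80b, bits 8 and 9 are clear and the status code is 0b000101, which is
how I read it as a stage-1 translation fault at level 1.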

This is where it gets confusing. If I understand correctly, GNTTABOP_copy will
return -EFAULT in only 2 cases:
    1) If copying the operation from the guest fails
    2) If copying the status to the guest fails

Is there any other place?

Based on the understanding above, we are in the second case. This would mean
that the kernel is modifying its page tables in the middle of the operation,
which I would find surprising.
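
As a cross-check on the -EFAULT reading, the register dump in the log below
shows r0 : fffffff2 at the BUG(). The sketch below reinterprets that value as
a signed 32-bit integer. It assumes r0 still holds the hypercall return value
at that point, which I have not verified against the disassembly; the helper
name to_s32 is only for illustration.

```python
import errno


def to_s32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


r0 = 0xFFFFFFF2  # r0 from the register dump in the log
ret = to_s32(r0)
print(ret, errno.errorcode.get(-ret))
```

On Linux this gives -14, and errno 14 is EFAULT, which would be consistent
with GNTTABOP_copy itself failing rather than a per-operation status error.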

I had a look at the errata for A15 r0, and found nothing promising.

So I am out of ideas about what is going on. Does anyone have a hint?

Cheers,

Dec  1 10:16:48.563092 (XEN) p2m.c:1434: d0v1: gvirt_to_maddr failed 
va=0xe1610e34 flags=0x1 par=0x80b
Dec  1 10:16:51.579178 [ 2452.187462] ------------[ cut here ]------------^M
Dec  1 10:16:51.587064 [ 2452.192116] kernel BUG at 
drivers/xen/grant-table.c:770!^M
Dec  1 10:16:51.587104 [ 2452.197514] Internal error: Oops - BUG: 0 [#1] SMP 
ARM^M
Dec  1 10:16:51.595084 [ 2452.202704] Modules linked in: xen_gntalloc 
wm8994_regulator s5p_mfc snd_soc_i2s snd_soc_idma snd_soc_s3c_dma 
videobuf2_dma_contig snd_soc_core videobuf2_memops videobuf2_v4l2 
videobuf2_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus 
v4l2_common wm8994 videodev pwm_samsung rtc_s3c usb3503 media s5p_sss dwc3 
dwc3_exynos clk_s2mps11 s5m8767 phy_exynos_usb2 dw_mmc_exynos dw_mmc_pltfm 
dw_mmc phy_exynos5250_sata ohci_exynos ehci_exynos phy_exynos5_usbdrd^M
Dec  1 10:16:51.635119 [ 2452.243938] CPU: 1 PID: 4552 Comm: vif9.0-q1-guest 
Not tainted 4.9.66 #1^M
Dec  1 10:16:51.643095 [ 2452.250719] Hardware name: SAMSUNG EXYNOS (Flattened 
Device Tree)^M
Dec  1 10:16:51.651085 [ 2452.256868] task: d9e1bc80 task.stack: d69bc000^M
Dec  1 10:16:51.651119 [ 2452.261475] PC is at gnttab_batch_copy+0xd0/0xe4^M
Dec  1 10:16:51.659059 [ 2452.266156] LR is at gnttab_batch_copy+0x1c/0xe4^M
Dec  1 10:16:51.659092 [ 2452.270843] pc : [<c06de490>]    lr : [<c06de3dc>]    
psr: a00f0013^M
Dec  1 10:16:51.667088 [ 2452.270843] sp : d69bded0  ip : deadbeef  fp : 
d69bc000^M
Dec  1 10:16:51.675079 [ 2452.282488] r10: e1610df0  r9 : d7bd5cc0  r8 : 
d69bdf38^M
Dec  1 10:16:51.683123 [ 2452.287768] r7 : e1610cb8  r6 : 00000001  r5 : 
e1610cb8  r4 : e1610e10^M
Dec  1 10:16:51.683164 [ 2452.294365] r3 : 00000000  r2 : deadbeef  r1 : 
deadbeef  r0 : fffffff2^M
Dec  1 10:16:51.691074 [ 2452.300963] Flags: NzCv  IRQs on  FIQs on  Mode 
SVC_32  ISA ARM  Segment none^M
Dec  1 10:16:51.699074 [ 2452.308167] Control: 10c5387d  Table: 79adc06a  DAC: 
00000051^M
Dec  1 10:16:51.707071 [ 2452.313998] Process vif9.0-q1-guest (pid: 4552, stack 
limit = 0xd69bc220)^M
Dec  1 10:16:51.715097 [ 2452.320839] Stack: (0xd69bded0 to 0xd69be000)^M
Dec  1 10:16:51.715149 [ 2452.325268] dec0:                                     
00000000 e1607cb8 e1610cb8 c1202d00^M
Dec  1 10:16:51.723068 [ 2452.333515] dee0: e1610cb8 c08be66c 00000000 00000002 
e1607cb8 c08bef1c d6a53e40 d6a53e40^M
Dec  1 10:16:51.731086 [ 2452.341761] df00: 00000001 c08be824 e1607cb8 e1607cb8 
e1610df0 c08bf038 c0367950 c037d248^M
Dec  1 10:16:51.739118 [ 2452.350007] df20: c0b6ae54 192b8000 d99bfe58 00000000 
d9e1bc80 c037d248 d69bdf38 d69bdf38^M
Dec  1 10:16:51.747073 [ 2452.358263] df40: c08bef24 00000000 d76ac040 e1607cb8 
c08bef24 00000000 00000000 00000000^M
Dec  1 10:16:51.755076 [ 2452.366498] df60: 00000000 c035f810 c131ba84 00000000 
00000001 e1607cb8 00000000 00000000^M
Dec  1 10:16:51.763095 [ 2452.374744] df80: d69bdf80 d69bdf80 00000000 00000000 
d69bdf90 d69bdf90 d69bdfac d76ac040^M
Dec  1 10:16:51.771111 [ 2452.382990] dfa0: c035f714 00000000 00000000 c0308838 
00000000 00000000 00000000 00000000^M
Dec  1 10:16:51.779098 [ 2452.391247] dfc0: 00000000 00000000 00000000 00000000 
00000000 00000000 00000000 00000000^M
Dec  1 10:16:51.787102 [ 2452.399482] dfe0: 00000000 00000000 00000000 00000000 
00000013 00000000 00000000 00000000^M
Dec  1 10:16:51.803058 [ 2452.407736] [<c06de490>] (gnttab_batch_copy) from 
[<c08be66c>] (xenvif_rx_copy_flush+0x1c/0x12c)^M
Dec  1 10:16:51.811047 [ 2452.416585] [<c08be66c>] (xenvif_rx_copy_flush) from 
[<c08bef1c>] (xenvif_rx_action+0x54/0x5c)^M
Dec  1 10:16:51.819103 [ 2452.425264] [<c08bef1c>] (xenvif_rx_action) from 
[<c08bf038>] (xenvif_kthread_guest_rx+0x114/0x2a4)^M
Dec  1 10:16:51.827149 [ 2452.434380] [<c08bf038>] (xenvif_kthread_guest_rx) 
from [<c035f810>] (kthread+0xfc/0x114)^M
Dec  1 10:16:51.835236 [ 2452.442627] [<c035f810>] (kthread) from [<c0308838>] 
(ret_from_fork+0x14/0x3c)^M
Dec  1 10:16:51.843187 [ 2452.449915] Code: e1c432b4 eaffffe0 e7f001f2 e8bd80f8 
(e7f001f2) ^M
Dec  1 10:16:51.851065 [ 2452.456080] ---[ end trace 69d872d84a71f07f ]---^M




> 
> Cheers,
> 
> [1] https://lists.xen.org/archives/html/xen-devel/2016-07/msg02571.htm
> 
> Nov 15 05:23:47.715172 [ 2156.529661] ------------[ cut here ]------------
> 
> Nov 15 05:24:04.483235 [ 2156.532899] kernel BUG at 
> drivers/xen/grant-table.c:770!
> 
> Nov 15 05:24:04.491191 [ 2156.538281] Internal error: Oops - BUG: 0 [#1] SMP 
> ARM
> 
> Nov 15 05:24:04.491233 [ 2156.543488] Modules linked in: xen_gntalloc 
> snd_soc_i2s snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine 
> snd_pcm wm8994_regulator snd_timer snd s5p_mfc wm8994 soundcore ac97_bus 
> videobuf2_dma_contig videobuf2_memops pwm_samsung videobuf2_v4l2 
> videobuf2_core v4l2_common videodev media s5p_sss usb3503 rtc_s3c dwc3 
> dwc3_exynos clk_s2mps11 s5m8767 dw_mmc_exynos dw_mmc_pltfm dw_mmc 
> phy_exynos5250_sata phy_exynos_usb2 ohci_exynos ehci_exynos phy_exynos5_usbdrd
> 
> Nov 15 05:24:04.531197 [ 2156.584721] CPU: 0 PID: 0 Comm: swapper/0 Not 
> tainted 4.9.20+ #1
> 
> Nov 15 05:24:04.539232 [ 2156.590793] Hardware name: SAMSUNG EXYNOS 
> (Flattened Device Tree)
> 
> Nov 15 05:24:04.547155 [ 2156.596957] task: c1207540 task.stack: c1200000
> 
> Nov 15 05:24:04.547179 [ 2156.601564] PC is at gnttab_batch_copy+0xd0/0xe4
> 
> Nov 15 05:24:04.555165 [ 2156.606246] LR is at gnttab_batch_copy+0x1c/0xe4
> 
> Nov 15 05:24:04.563145 [ 2156.610932] pc : [<c06dba3c>]    lr : [<c06db988>]  
>   psr: a0000113
> 
> Nov 15 05:24:04.563169 [ 2156.610932] sp : c1201d98  ip : deadbeef  fp : 
> e160c000
> 
> Nov 15 05:24:04.571111 [ 2156.622563] r10: c1201e90  r9 : 00000040  r8 : 
> 00000040
> 
> Nov 15 05:24:04.579105 [ 2156.627858] r7 : e160c000  r6 : 00000001  r5 : 
> e1610e00  r4 : e160e6d8
> 
> Nov 15 05:24:04.587154 [ 2156.634453] r3 : e160e6d8  r2 : deadbeef  r1 : 
> deadbeef  r0 : fffffff2
> 
> Nov 15 05:24:04.587189 [ 2156.641054] Flags: NzCv  IRQs on  FIQs on  Mode 
> SVC_32  ISA ARM  Segment none
> 
> Nov 15 05:24:04.595168 [ 2156.648256] Control: 10c5387d  Table: 7684006a  
> DAC: 00000051
> 
> Nov 15 05:24:04.603139 [ 2156.654072] Process swapper/0 (pid: 0, stack limit 
> = 0xc1200220)
> 
> Nov 15 05:24:04.611224 [ 2156.660147] Stack: (0xc1201d98 to 0xc1202000)
> 
> Nov 15 05:24:04.611279 [ 2156.664574] 1d80:                                   
>                     e160e6d8 00000000
> 
> Nov 15 05:24:04.619248 [ 2156.672824] 1da0: e1610e00 00000000 e160c000 
> c08b65a0 da3cb580 00000000 00000002 c039c1f8
> 
> Nov 15 05:24:04.627228 [ 2156.681080] 1dc0: c8c27db0 c039c85c 000f4240 
> 00000000 c1201df4 e160e6d8 0e29ffbb 000001f6
> 
> Nov 15 05:24:04.635141 [ 2156.689314] 1de0: f26f65f2 20000193 00000000 
> c8c27d48 00000008 00000000 00000001 0000005f
> 
> Nov 15 05:24:04.643124 [ 2156.697561] 1e00: c8c27dfc 60000193 c8c27d48 
> c090be6c 000f4240 00000000 00000000 c0372be4
> 
> Nov 15 05:24:04.651211 [ 2156.705807] 1e20: c03103bc c058eec4 0002d51e 
> c0752c50 c8c27d48 0f268479 00000001 e160c020
> 
> Nov 15 05:24:04.667186 [ 2156.714053] 1e40: e160c020 00000000 e160c000 
> 00000040 00000040 c1201e90 192b8000 c08b9124
> 
> Nov 15 05:24:04.675174 [ 2156.722298] 1e60: e160c020 00000001 0002d520 
> 0000012c c1202d00 c0a5586c 00000008 da3ce740
> 
> Nov 15 05:24:04.683182 [ 2156.730545] 1e80: c1116740 c131473e c1204f20 
> c1204f20 c1201e90 c1201e90 c1201e98 c1201e98
> 
> Nov 15 05:24:04.691270 [ 2156.738791] 1ea0: 00000000 00000000 00000003 
> c120208c c1200000 c1202080 00000100 c1202080
> 
> Nov 15 05:24:04.699158 [ 2156.747037] 1ec0: 40000003 c0348760 df003000 
> c11151a8 c1201ec8 c131b200 0000000a 0002d51f
> 
> Nov 15 05:24:04.707231 [ 2156.755282] 1ee0: c1202d00 00200100 d9808000 
> c1113e04 00000000 00000000 00000001 d9808000
> 
> Nov 15 05:24:04.715132 [ 2156.763529] 1f00: df003000 c11151a8 c12030a0 
> c0348b7c 00000095 c038aa74 c123f3c8 c1203440
> 
> Nov 15 05:24:04.723132 [ 2156.771785] 1f20: df00200c c1201f50 df002000 
> c0301754 c030928c c0309290 60000013 ffffffff
> 
> Nov 15 05:24:04.731168 [ 2156.780021] 1f40: c1201f84 00000000 c1200000 
> c030d10c 00000001 00000000 00000001 c031c520
> 
> Nov 15 05:24:04.739171 [ 2156.788266] 1f60: c1200000 c1203034 c1203098 
> 00000001 00000000 00000000 c11151a8 c12030a0
> 
> Nov 15 05:24:04.747165 [ 2156.796513] 1f80: 192b8000 c1201fa0 c030928c 
> c0309290 60000013 ffffffff 00000051 00000000
> 
> Nov 15 05:24:04.755233 [ 2156.804759] 1fa0: 00000000 c037d88c c1201fa8 
> c12387b1 00000000 ffffffff 00000000 c1000c5c
> 
> Nov 15 05:24:04.763183 [ 2156.813005] 1fc0: ffffffff ffffffff 00000000 
> c100068c 00000000 c10abe40 c1318ed4 c120301c
> 
> Nov 15 05:24:04.771237 [ 2156.821250] 1fe0: c10abe3c c12087e0 6020406a 
> 410fc0f4 00000000 6020807c 00000000 00000000
> 
> Nov 15 05:24:04.779267 [ 2156.829511] [<c06dba3c>] (gnttab_batch_copy) from 
> [<c08b65a0>] (xenvif_tx_action+0x80/0x738)
> 
> Nov 15 05:24:04.787225 [ 2156.838010] [<c08b65a0>] (xenvif_tx_action) from 
> [<c08b9124>] (xenvif_poll+0x28/0x64)
> 
> Nov 15 05:24:04.795188 [ 2156.845908] [<c08b9124>] (xenvif_poll) from 
> [<c0a5586c>] (net_rx_action+0x1e4/0x2d8)
> 
> Nov 15 05:24:04.803191 [ 2156.853717] [<c0a5586c>] (net_rx_action) from 
> [<c0348760>] (__do_softirq+0xfc/0x218)
> 
> Nov 15 05:24:04.811217 [ 2156.861528] [<c0348760>] (__do_softirq) from 
> [<c0348b7c>] (irq_exit+0xe4/0x140)
> 
> Nov 15 05:24:04.819157 [ 2156.868908] [<c0348b7c>] (irq_exit) from 
> [<c038aa74>] (__handle_domain_irq+0x60/0xb4)
> 
> Nov 15 05:24:04.827160 [ 2156.876806] [<c038aa74>] (__handle_domain_irq) from 
> [<c0301754>] (gic_handle_irq+0x48/0x8c)
> 
> Nov 15 05:24:04.835169 [ 2156.885223] [<c0301754>] (gic_handle_irq) from 
> [<c030d10c>] (__irq_svc+0x6c/0x90)
> 
> Nov 15 05:24:04.843205 [ 2156.892771] Exception stack(0xc1201f50 to 
> 0xc1201f98)
> 
> Nov 15 05:24:04.843240 [ 2156.897894] 1f40:                                   
>   00000001 00000000 00000001 c031c520
> 
> Nov 15 05:24:04.859140 [ 2156.906141] 1f60: c1200000 c1203034 c1203098 
> 00000001 00000000 00000000 c11151a8 c12030a0
> 
> Nov 15 05:24:04.867283 [ 2156.914387] 1f80: 192b8000 c1201fa0 c030928c 
> c0309290 60000013 ffffffff
> 
> Nov 15 05:24:04.867324 [ 2156.921078] [<c030d10c>] (__irq_svc) from 
> [<c0309290>] (arch_cpu_idle+0x38/0x3c)
> 
> Nov 15 05:24:04.875243 [ 2156.928542] [<c0309290>] (arch_cpu_idle) from 
> [<c037d88c>] (cpu_startup_entry+0x194/0x218)
> 
> Nov 15 05:24:04.883200 [ 2156.936874] [<c037d88c>] (cpu_startup_entry) from 
> [<c1000c5c>] (start_kernel+0x380/0x38c)
> 
> Nov 15 05:24:04.891277 [ 2156.945116] Code: e1c432b4 eaffffe0 e7f001f2 
> e8bd80f8 (e7f001f2)
> 
> Nov 15 05:24:04.899168 [ 2156.951298] ---[ end trace 766604e7ecb29bdc ]---
> 
> Nov 15 05:24:04.907166 [ 2156.955961] Kernel panic - not syncing: Fatal 
> exception in interrupt
> 
> Nov 15 05:24:04.915143 [ 2156.962415] CPU1: stopping
> 
> Nov 15 05:24:04.915174 [ 2156.965166] CPU: 1 PID: 0 Comm: swapper/1 Tainted: 
> G      D         4.9.20+ #1
> 
> Nov 15 05:24:04.923290 [ 2156.972453] Hardware name: SAMSUNG EXYNOS 
> (Flattened Device Tree)
> 
> Nov 15 05:24:04.931162 [ 2156.978631] [<c0310f94>] (unwind_backtrace) from 
> [<c030c574>] (show_stack+0x10/0x14)
> 
> Nov 15 05:24:04.939192 [ 2156.986436] [<c030c574>] (show_stack) from 
> [<c0591068>] (dump_stack+0x98/0xac)
> 
> Nov 15 05:24:04.939219 [ 2156.993729] [<c0591068>] (dump_stack) from 
> [<c030f688>] (handle_IPI+0x174/0x194)
> 
> Nov 15 05:24:04.947171 [ 2157.001192] [<c030f688>] (handle_IPI) from 
> [<c0301794>] (gic_handle_irq+0x88/0x8c)
> 
> Nov 15 05:24:04.955151 [ 2157.008829] [<c0301794>] (gic_handle_irq) from 
> [<c030d10c>] (__irq_svc+0x6c/0x90)
> 
> Nov 15 05:24:04.963147 [ 2157.016376] Exception stack(0xd98dbf88 to 
> 0xd98dbfd0)
> 
> Nov 15 05:24:04.971150 [ 2157.021597] bf80:                   00000001 
> 00000000 00000001 c031c520 d98da000 c1203034
> 
> Nov 15 05:24:04.979206 [ 2157.029850] bfa0: c1203098 00000002 00000000 
> 00000000 c11151a8 c12030a0 192c6000 d98dbfd8
> 
> Nov 15 05:24:04.987142 [ 2157.038093] bfc0: c030928c c0309290 600f0013 
> ffffffff
> 
> Nov 15 05:24:04.995142 [ 2157.043118] [<c030d10c>] (__irq_svc) from 
> [<c0309290>] (arch_cpu_idle+0x38/0x3c)
> 
> Nov 15 05:24:05.003148 [ 2157.050584] [<c0309290>] (arch_cpu_idle) from 
> [<c037d88c>] (cpu_startup_entry+0x194/0x218)
> 
> Nov 15 05:24:05.011218 [ 2157.058913] [<c037d88c>] (cpu_startup_entry) from 
> [<60301c4c>] (0x60301c4c)
> 
> Nov 15 05:24:05.011257 [ 2157.066095] ---[ end Kernel panic - not syncing: 
> Fatal exception in interrupt
> 
> 
