
Re: [Xen-devel] Fix VGA logdirty related display freezes with altp2m



On 10/25/18 11:11 PM, Tamas K Lengyel wrote:
> On Thu, Oct 25, 2018 at 9:08 AM Tamas K Lengyel
> <tamas.k.lengyel@xxxxxxxxx> wrote:
>>
>> On Thu, Oct 25, 2018 at 9:02 AM Razvan Cojocaru
>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>
>>> On 10/25/18 5:55 PM, Tamas K Lengyel wrote:
>>>> On Thu, Oct 25, 2018 at 8:24 AM Razvan Cojocaru
>>>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> On 10/24/18 8:52 PM, Tamas K Lengyel wrote:
>>>>>> On Wed, Oct 24, 2018 at 11:31 AM Tamas K Lengyel
>>>>>> <tamas.k.lengyel@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru
>>>>>>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> On 10/24/18 8:09 PM, Tamas K Lengyel wrote:
>>>>>>>>> On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru
>>>>>>>>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>> Tamas, could you please give this a spin?
>>>>>>>>>>
>>>>>>>>>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2
>>>>>>>>>>
>>>>>>>>>> It _should_ solve the crashes.
>>>>>>>>>
>>>>>>>>> Indeed, I no longer see the crash. However, there might be some
>>>>>>>>> locking issue present because the whole system freezes up shortly
>>>>>>>>> after starting DRAKVUF on a domain - within a couple seconds. I mean
>>>>>>>>> Xen itself locks up: no response on the serial, dom0 screen frozen,
>>>>>>>>> etc.
>>>>>>>>
>>>>>>>> Do you have any type of log / backtrace / way I could reproduce it
>>>>>>>> without Drakvuf? All the ways I've tested it were fine (including
>>>>>>>> xen-access).
>>>>>>>
>>>>>>> I don't have a standalone test that produces that error. With DRAKVUF
>>>>>>> it is easily reproducible though. If you have a Windows guest
>>>>>>> installed, setting up DRAKVUF should really not be much trouble. With
>>>>>>> xen-access it indeed doesn't lock up but since the guest is pretty
>>>>>>> much unresponsive during that test I can't verify whether the VGA
>>>>>>> issue is now resolved or not. Also the xen-access tests are fairly
>>>>>>> limited and don't use all aspects of altp2m.
>>>>>>>
>>>>>>
>>>>>> What I see from the DRAKVUF log is that the last thing it prints is
>>>>>> sending a vm_event response that both enables singlestepping and
>>>>>> switches altp2m view. This looks to be consistent. It didn't matter if
>>>>>> the guest had 1 or 2 vCPUs, the freeze occurs just the same. It's
>>>>>> definitely racy because it doesn't happen right away; the system
>>>>>> works as expected for a couple of seconds.
>>>>>
>>>>> After having to install clang because my GCC couldn't build Drakvuf:
>>>>>
>>>>> ../../src/plugins/plugins.h:188:1: sorry, unimplemented: non-trivial
>>>>> designated initializers not supported
>>>>
>>>> Please follow the instructions for compiling it; clang is a
>>>> requirement. I don't even know how you got past the ./configure stage
>>>> without clang being installed. You could also just copy-paste things
>>>> from the travis script directly:
>>>> https://github.com/tklengyel/drakvuf/blob/master/.travis.yml#L51
>>>>
>>>>>
>>>>> then rekall via pip, then having to mount my Windows disk to do "rekal
>>>>> peinfo", I finally gave up when "rekall fetch_pdb" couldn't find the
>>>>> debug files on the Microsoft server. :)
>>>>
>>>> If your version of Windows is that brand new then yes, Microsoft takes
>>>> a couple days to publish their debug information and you will just
>>>> have to wait or use an older version of Windows.
>>>>
>>>>>
>>>>> So if you could find a way to reproduce the issue with a simple
>>>>> libxc-based application alone (or at least with something
>>>>> libvmi-related, which I do have set up), I'd really appreciate it.
>>>>>
>>>>> Or maybe try to hack around with patch no 3 of the series (for a start,
>>>>> just revert it and see if the problem persists - of course the display
>>>>> will freeze) and see if there's an easy fix?
>>>>
>>>> Unfortunately I won't have time to do either of these any time soon.
>>>> If you are having that much trouble setting it up I can perhaps send
>>>> you a pre-compiled version with a version of Windows for which
>>>> Microsoft has already published the debug info.
>>>
>>> It's a Windows 7 x64 guest. But the problem was that the right command
>>> line is:
>>>
>>> rekall fetch_pdb ntkrnlmp
>>>
>>> instead of the suggested "rekall fetch_pdb ntkrpamp" on the drakvuf.com
>>> website.
>>
>> The kernel filename is specific to the version of Windows you have
>> installed. The instructions specify _an example_ for the 32-bit
>> version of Windows 7 and you will need to adjust it according to the
>> kernel filename. For 64-bit it is ntkrnlmp. The instructions explicitly
>> say that you need to use the PDB filename that was printed for your
>> specific kernel version.
>>
>>>
>>> I'll try to continue - in any case should I have more trouble I'll
>>> contact you privately so as not to spam the list. Just wanted to leave
>>> this here in case someone else has this problem in the hope that it's
>>> useful.
>>
>> Of course, also please feel free to open an issue on github if you run
>> into something that's blocking you. Chances are if you run into it,
>> others would too :)
> 
> We can chalk the freeze issue up to buggy hardware on my side. We
> couldn't reproduce the issue on two other systems. The screen issue is
> definitely gone now which is awesome! :) Thanks Razvan!

No problem, thank you for testing! And I'm competent with Drakvuf now. :)


Thanks,
Razvan
