Pat Campbell <plc@xxxxxxxxxx> writes:
> Markus Armbruster wrote:
>> Pat Campbell <plc@xxxxxxxxxx> writes:
>>
>>
>>> Markus Armbruster wrote:
>>>
>>>> Pat Campbell <plc@xxxxxxxxxx> writes:
>>>>
>>>>
>>>>
>>>>> Attached patch adds dynamic frame buffer resolution support to
>>>>> the PV xenfb frame buffer driver.
>>>>>
>>>>> Corresponding backend IOEMU patch is required for functionality but
>>>>> this patch is not dependent on it, preserving backwards compatibility.
>>>>>
>>>>> Please apply to tip of linux-2.6.18-xen
>>>>>
>>>>> Signed-off-by: Pat Campbell <plc@xxxxxxxxxx>
>>>>>
>>>>>
>>>> Adding another lock (queue_lock) to our already ticklish locking gives
>>>> me a queasy feeling...
>>>>
>>>> The purpose of the lock is not obvious to me. Please explain that, in
>>>> the code. Likewise, it's not entirely obvious that the locking works.
>>>> Please update the "How the locks work together" comment.
>>>>
>>>> But before you do that, I suggest you tell us exactly what problem
>>>> you're attempting to solve with queue_lock. Maybe we can come up with
>>>> a less scary solution. Maybe not. But it's worth a try. If it's
>>>> just to ensure the changes made by xenfb_set_par() are seen in
>>>> xenfb_do_resize() correctly, a memory barrier should to the trick more
>>>> easily.
>>>>
[Pat explains the race...]
>>
>> Your race is real.
>>
>> Your old code shared the resolution state (info->resize_dpy and the
>> resolution variables in info->fb_info) between xenfb_do_resize()
>> (running in xenfb_thread) and xenfb_set_par() (running in another
>> thread).
>>
>> Your new code shares the ring buffer instead.
>>
>> Both methods can be made race-free with suitable synchronization.
>>
>> With your old code, xenfb_set_par() always succeeds. Until the
>> backend gets around to process the XENFB_TYPE_RESIZE message, it
>> interprets the frame buffer for the old resolution. As you argue
>> below, this isn't fatal, it can just mess up the display somewhat.
>>
>> With your new code, xenfb_set_par() can take a long time (one second),
>> and it can fail, leaving the display in a deranged state until the
>> next resolution change. It still doesn't fully synchronize with the
>> backend: it returns successfully after sending the message, leaving
>> the time window until the backend receives the message open.
>>
> Synchronizing with the backend? I don't think this is necessary. The
> update events that follow the resize event will be for the new size, as
> long as the backend gets the resize event we should be ok.
I didn't mean to say that synchronizing with the backend is necessary.
I just wanted to point out that your new method pays the price of
synchronization, namely a possibly significant delay in
xenfb_set_par(), plus an ugly failure mode there, without actually
achieving synchronization.
>> I'd prefer the old behavior. But I could be missing something here.
>> Am I?
>>
>>
> I like the new behavior, it seems more straight forward and does not
> rely on the xenfb_thread() being active. Our first set_par() happens
> during frame buffer registration which is before xenfb_thread() is
> started. I disliked adding new code into the xenfb_update() function
> for an event that happens rarely so will revert back to the sharing of
> resize_dpy flag code.
>
> Is this how you envision xenfb_set_par() synchronization?
>
> static int xenfb_set_par(struct fb_info *info)
> {
> struct xenfb_info *xenfb_info;
> unsigned long flags;
>
> xenfb_info = info->par;
>
> if (xenfb_info->kthread) {
> spin_lock_irqsave(&xenfb_info->resize_lock, flags);
> info->var.xres_virtual = info->var.xres;
> info->var.yres_virtual = info->var.yres;
> xenfb_info->resize_dpy = 1;
> /* Wake up xenfb_thread() */
> xenfb_refresh(xenfb_info, 0, 0, 1, 1);
> while (xenfb_info->kthread && xenfb_info->resize_dpy == 1) {
> msleep(10);
> }
> spin_unlock_irqrestore(&xenfb_info->resize_lock, flags);
> }
> return 0;
> }
>
> To explain the code a liitle bit.
> 1. Need to test for kthread because first xenfb_set_par() happens
> before xenfb_thread() is started
Why can you just ignore the mode change then?
> 2. Need to wakeup xenfb_thread via refresh as application screen events
> are blocked waiting on the set_par to complete. Can't just set dirty,
> will get a bogus nasty gram.
I'd pass an empty rectangle there:
xenfb_refresh(xenfb_info, INT_MAX, INT_MAX, 0, 0);
Alternatively, factor out the code to wake up the thread into a new
function, and call that here and from __xenfb_refresh(). That would
be cleaner.
> 3. Testing xenfb_thread in the while loop in case xenfb_remove() is
> called and thread is shutdown. Protects against a system shutdown
> hang. Don't know if that can happen, defensive code.
Sleeps while holding a spinlock. Not good. And I didn't get the hang
you're trying to defend against.
xenfb_resize_screen() still accesses the size unsynchronized, and can
thus see intermediate states of resolution changes.
No, that's not what I had in mind. Let me sketch it.
The first question to answer for a mutex is: what shared state does it
protect? resize_lock protects the screen size fb_info->var.xres,
fb_info->var.yres and the flag resize_dpy.
Once that's clear, the use of the mutex becomes obvious: wrap it
around any access of the shared state, i.e. the updating of the shared
state it protects in xenfb_set_par() and the reading in
xenfb_thread(). Code sketch:
static void xenfb_resize_screen(struct xenfb_info *info)
{
if (xenfb_queue_full(info))
return;
/* caller holds info->resize_lock */
info->resize_dpy = 0;
xenfb_do_resize(info);
}
static int xenfb_thread(void *data)
{
struct xenfb_info *info = data;
unsigned long flags;
while (!kthread_should_stop()) {
spin_lock_irqsave(info->resize_lock, flags);
if (info->resize_dpy)
xenfb_resize_screen(info);
spin_unlock_irqrestore(info->resize_lock, flags);
[...]
}
[...]
static int xenfb_set_par(struct fb_info *info)
{
struct xenfb_info *xenfb_info = info->par;
unsigned long flags;
spin_lock_irqsave(info->resize_lock, flags);
info->var.xres_virtual = info->var.xres;
info->var.yres_virtual = info->var.yres;
xenfb_info->resize_dpy = 1;
spin_unlock_irqrestore(info->resize_lock, flags);
if (xenfb_info->kthread) {
/* Wake up xenfb_thread() */
xenfb_refresh(xenfb_info, INT_MAX, INT_MAX, 0, 0);
}
return 0;
}
Note that xenfb_set_par() can schedule resolution changes just fine
before xenfb_thread() runs. It'll pick them up right when it starts.
I used xenfb_refresh() to wake up xenfb_thread() only to keep things
simple, not to express a preference for that method.
Don't just copy my code sketch! Review it carefully, please.
>
>
> -------------------
> New lock comment
>
> /*
> * There are three locks: spinlock resize_lock protecting resize_dpy,
It actually protects the screen size and resize_dpy.
> * spinlock dirty_lock protecting the dirty rectangle and mutex
> * mm_lock protecting mappings.
> *
> * resize_lock is used to synchronize resize_dpy access between
> * xenfb_set_par() and xenfb_do_resize() running in xenfb_thread().
> *
> * How the dirty and mapping locks work together
> *
> ..........
>
>
>> I think you could fix the old code by protecting access to the shared
>> resolution state by a spin lock.
>>
>> If I am missing something, and your new code is the way to go, you
>> still need to migrate your explanation why it works from e-mail to
>> source file, where the poor maintainer can find it.
>>
>> [...]
>>
>> Looks like this is the last issue that you haven't addressed /
>> explained fully yet :)
>>
>>
> Yahoo, getting close. Thanks
>
> Just an FYI, I am at Brainshare this week so my responses might be a
> little slow but I will try to turn around any comments I receive as soon
> as possible. Like to put this to bed before any more staging changes
> take place.
Me too! And thanks for pushing this feature all the way.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|