
Re: [Xen-devel] [PATCH 00/25 v7] SBSA UART emulation support in Xen





On 09/08/17 11:58, Bhupinder Thakur wrote:
Hi Julien,

Hi Bhupinder,

Thanks for the testing.

On 8 August 2017 at 21:29, Julien Grall <julien.grall@xxxxxxx> wrote:
Hi Bhupinder,

I gave it another try and I have a couple of comments.

Booting Linux with earlycon enabled takes quite a while. I can see the
characters coming slower than on the minitel. It seems to be a bit better
after switching off the bootconsole. Overall Linux takes ~20 times longer
to boot with pl011 than with the HVC console.

I do agree that pl011 is emulated and therefore you have to trap after each
character. But 20 times sounds far too much.

I think this slowness could be due to rate limiting of the pl011 events
in xenconsole. Currently, the rate limit is
set to 30 events per 200 msecs (see RATE_LIMIT_ALLOWANCE/RATE_LIMIT_PERIOD).

I increased the rate limit to 600 events (30 * 20) per 200 msecs. With
this change,
I see that the find command runs faster and more smoothly.
Earlier the find output was jerky.
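
To make the numbers above concrete, here is a minimal sketch of a fixed-window rate limiter using the two constants named in this thread. It is not the xenconsole code: the struct, the helper names and the fixed-window policy are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in this thread. Everything else below is hypothetical. */
#define RATE_LIMIT_ALLOWANCE 30   /* events allowed per period */
#define RATE_LIMIT_PERIOD    200  /* period length in milliseconds */

struct rate_limiter {
    uint64_t period_start_ms;  /* start of the current accounting window */
    unsigned int events;       /* events already accepted in this window */
};

/* Returns true if an event arriving at now_ms may be handled right away. */
static bool rate_limit_allow(struct rate_limiter *rl, uint64_t now_ms)
{
    assert(now_ms >= rl->period_start_ms);

    /* Open a new window once the current one has expired. */
    if (now_ms - rl->period_start_ms >= RATE_LIMIT_PERIOD) {
        rl->period_start_ms = now_ms;
        rl->events = 0;
    }

    /* Allowance used up: the caller has to wait for the next window. */
    if (rl->events >= RATE_LIMIT_ALLOWANCE)
        return false;

    rl->events++;
    return true;
}

/* Upper bound on events per second implied by the two constants. */
static unsigned int rate_limit_max_events_per_sec(void)
{
    return RATE_LIMIT_ALLOWANCE * (1000 / RATE_LIMIT_PERIOD);
}
```

With 30 events per 200 msecs the bound is 150 events per second. If every character ends up costing one event, that also caps output at roughly 150 characters per second in this model.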

I think there might be another solution that avoids increasing the rate limit.

If you look at the earlycon code for pl011 in Linux:

static void pl011_putc(struct uart_port *port, int c)
{
        while (readl(port->membase + UART01x_FR) & UART01x_FR_TXFF)
                cpu_relax();
        if (port->iotype == UPIO_MEM32)
                writel(c, port->membase + UART01x_DR);
        else
                writeb(c, port->membase + UART01x_DR);
        while (readl(port->membase + UART01x_FR) & UART01x_FR_BUSY)
                cpu_relax();
}

Linux will wait for the UART to be idle before sending a new character.

Now looking at the vpl011 emulation, the busy bit is set when a new character is queued (see vpl011_write_data). This bit is only cleared when the console daemon raises an event and the queue is empty (see vpl011_data_avail).

This means that for earlycon you need a round trip Guest -> Xen -> Dom0 -> Xen -> Guest for every single character. This is counterproductive, and combined with the rate limit it gets even worse.

I would take a different approach on the BUSY bit. We can consider the queue between Xen and xenconsoled as being outside of the UART. If the character is queued, then the job is done. I think this would improve performance quite a lot.
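
To illustrate the difference, here is a toy model rather than the vpl011 code. The toy_* names, the ring size and the synchronous "drain" that stands in for a Dom0 round trip are all assumptions; only the BUSY and TXFF bit positions come from the PL011 flag register layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FR_BUSY   (1u << 3)  /* PL011 UARTFR.BUSY */
#define TOY_FR_TXFF   (1u << 5)  /* PL011 UARTFR.TXFF */
#define TOY_RING_SIZE 8          /* arbitrary size for this sketch */

struct toy_vuart {
    uint8_t ring[TOY_RING_SIZE];
    size_t prod, cons;      /* free-running producer/consumer indices */
    uint32_t fr;            /* emulated flag register */
    bool busy_tracks_ring;  /* true: BUSY stays set until the ring is empty;
                               false: queued characters count as sent */
};

static size_t toy_queued(const struct toy_vuart *u)
{
    return u->prod - u->cons;
}

/* Guest write to the data register. Returns false if the character is dropped. */
static bool toy_write_dr(struct toy_vuart *u, uint8_t c)
{
    if (toy_queued(u) == TOY_RING_SIZE)
        return false;
    u->ring[u->prod++ % TOY_RING_SIZE] = c;
    if (toy_queued(u) == TOY_RING_SIZE)
        u->fr |= TOY_FR_TXFF;
    if (u->busy_tracks_ring)
        u->fr |= TOY_FR_BUSY;
    return true;
}

/* The backend consumed up to n characters and notified the emulation. */
static void toy_backend_drain(struct toy_vuart *u, size_t n)
{
    size_t queued = toy_queued(u);

    u->cons += (n > queued) ? queued : n;
    if (toy_queued(u) < TOY_RING_SIZE)
        u->fr &= ~TOY_FR_TXFF;
    if (toy_queued(u) == 0)
        u->fr &= ~TOY_FR_BUSY;
}

/*
 * Mimics the polling in pl011_putc() and counts how often the guest has to
 * wait for the backend. Each wait is modelled as one full drain, that is,
 * one Guest -> Xen -> Dom0 -> Xen -> Guest round trip.
 */
static unsigned int toy_earlycon_round_trips(struct toy_vuart *u,
                                             const char *s, size_t len)
{
    unsigned int trips = 0;

    for (size_t i = 0; i < len; i++) {
        while (u->fr & TOY_FR_TXFF) {
            toy_backend_drain(u, toy_queued(u));
            trips++;
        }
        toy_write_dr(u, (uint8_t)s[i]);
        while (u->fr & TOY_FR_BUSY) {
            toy_backend_drain(u, toy_queued(u));
            trips++;
        }
    }
    return trips;
}
```

In this model, writing 20 characters costs 20 round trips when BUSY tracks the ring, and only 2 when BUSY is left clear and the guest waits only on TXFF.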

Also, I would append a new patch at the end of the series rather than modify patch #1. This would avoid the need for more review :).


After that I tried to stress the emulation a bit with "find ." to get a lot
of output. And I noticed a lot of messages similar to the one below on the Xen
console:

d6v0 vpl011: Unexpected OUT ring buffer full

Along with that, characters have been eaten, resulting in a nonsensical log.

A bit above the printk printing this message, there is a comment saying:

    /*
     * It is expected that the ring is not full when this function is called
     * as the guest is expected to write to the data register only when the
     * TXFF flag is not set.
     * In case the guest does write even when the TXFF flag is set then the
     * data will be silently dropped.
     */

I am quite surprised that Linux is not looking at the TXFF flag. So this
needs some investigation.
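
As a rough illustration of what that comment describes, here is a toy OUT ring that silently drops writes once it is full. This is not the Xen code: the demo_* names, the ring size and the drop counter are made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_RING_SIZE 4  /* arbitrary size for this sketch */

struct demo_out_ring {
    char data[DEMO_RING_SIZE];
    size_t used;     /* characters queued and not yet consumed */
    size_t dropped;  /* characters lost because the ring was full */
};

/* TXFF as the guest would see it: set exactly when the ring is full. */
static bool demo_txff(const struct demo_out_ring *r)
{
    return r->used == DEMO_RING_SIZE;
}

/* Data register write: queue the character, or drop it if the ring is full. */
static void demo_write_dr(struct demo_out_ring *r, char c)
{
    if (demo_txff(r)) {
        r->dropped++;
        return;
    }
    r->data[r->used++] = c;
}

/*
 * A writer that honours TXFF: it stops as soon as the flag is set and
 * returns how many characters were accepted, so nothing is ever dropped.
 */
static size_t demo_write_checked(struct demo_out_ring *r,
                                 const char *s, size_t len)
{
    size_t i = 0;

    while (i < len && !demo_txff(r))
        demo_write_dr(r, s[i++]);
    return i;
}
```

A writer that calls demo_write_dr() six times in a row on an empty ring loses two characters, whereas demo_write_checked() accepts four and leaves the remaining two for the caller to retry.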

I ran 'find' but could not reproduce the issue.

Sorry, I forgot to specify that you need to run find in a directory with a lot of files. A good option would be:

find /

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

