
Re: [Xen-devel] DomU's network interface will hung when Dom0 running 32bit

On 2013-10-17 0:11, David Vrabel wrote:
On 16/10/13 16:17, Wei Liu wrote:
On Wed, Oct 16, 2013 at 11:04:34PM +0800, jianhai luan wrote:
>From ef02403a10173896c5c102f768741d0700b8a3a2 Mon Sep 17 00:00:00 2001
From: Jason Luan <jianhai.luan@xxxxxxxxxx>
Date: Tue, 15 Oct 2013 17:07:49 +0800
Subject: [PATCH] xen-netback: pending timer only in the range [expire,

The function time_after_eq() do correct judge in range of MAX_UNLONG/2.
If net-front send lesser package, the delta between now and next_credit
will out of the range and time_after_eq() will do wrong judge in result
to net-front hung.  For example:
     expire    next_credit    ....    next_credit+MAX_UNLONG/2    now
     -----------------time increases this direction----------------->

We should be add the environment which now beyond next_credit+MAX_UNLONG/2.
Because the fact now mustn't before expire, time_before(now, expire) == true
will show the environment.
     time_after_eq(now, next_credit) || time_before (now, expire)
     !time_in_range_open(now, expire, next_credit)

I would like the description improved because it's too hard to understand.

How about something like:

"time_after_eq() only works if the delta is < MAX_ULONG/2.

If netfront sends at a very low rate, the time between subsequent calls
to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for
time_after_eq() will be incorrect.  Credit will not be replenished and
the guest may become unable to send (e.g., if prior to the long gap, all
credit was exhausted)."
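The wraparound limit described above can be reproduced in userspace. The following is only a sketch: time_after_eq32() is a hypothetical stand-in that mimics the kernel's time_after_eq() on a 32-bit build (where unsigned long is 32 bits wide), and the jiffies values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for time_after_eq() with 32-bit jiffies.
 * The kernel macro evaluates (long)(a - b) >= 0; here the signed cast is
 * int32_t, which assumes the usual two's-complement conversion.
 */
int time_after_eq32(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

/* Illustrative check: the answer flips once the gap passes 2^31 jiffies. */
void check_half_range_limit(void)
{
    uint32_t next_credit = 1000u;                     /* made-up value */
    uint32_t now_near = next_credit + 5u;             /* small gap */
    uint32_t now_far = next_credit + 0x80000000u + 5u; /* gap > MAX_ULONG/2 */

    assert(time_after_eq32(now_near, next_credit) == 1); /* correct */
    assert(time_after_eq32(now_far, next_credit) == 0);  /* wrong: now is later */
}
```

With the second value, now is really later than next_credit, yet the comparison reports the opposite, so credit would not be replenished.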

Thanks for your description; I will accept it. :)

But that's as far as I get because I can't see how the fix is correct.
The time_in_range() test might still return the wrong value if now has
advanced even further and wrapped so it is between expire and
next_credit again.

Typo: time_in_range() should be time_in_range_open().

Yes, if now has advanced even further and wrapped, it will fall in
[expire, next_credit) again.  Within that range, please consider two
scenarios:

* No transmit limit: expire == next_credit, so the range is empty and the
  replenish will always be done.
* Transmit limit: the guest may consume all credit_bytes in a very short
  time and then send no packets for the rest of [expire, next_credit).
  That idle time should already be taken into account when the rate
  parameter is set.

So if now falls in the range, the hang time should be acceptable (if
rate=10000M/s, the worst case will be 4s).
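To make the two checks easier to compare, here is a userspace sketch, not the netback code itself. The helper names (after_eq32, before32, in_range_open32, replenish_old, replenish_new) are hypothetical; they mirror time_after_eq(), time_before() and time_in_range_open() on the assumption of 32-bit jiffies, and all jiffies values are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit stand-ins for the kernel jiffies helpers. */
int after_eq32(uint32_t a, uint32_t b) { return (int32_t)(a - b) >= 0; }
int before32(uint32_t a, uint32_t b)   { return (int32_t)(a - b) < 0; }

/* Mirrors time_in_range_open(a, b, c): true when a lies in [b, c). */
int in_range_open32(uint32_t a, uint32_t b, uint32_t c)
{
    return after_eq32(a, b) && before32(a, c);
}

/* Check before the patch: replenish once now has reached next_credit. */
int replenish_old(uint32_t now, uint32_t next_credit)
{
    return after_eq32(now, next_credit);
}

/* Check proposed in the patch: replenish unless now is in [expire, next_credit). */
int replenish_new(uint32_t now, uint32_t expire, uint32_t next_credit)
{
    return !in_range_open32(now, expire, next_credit);
}

void check_replenish_scenarios(void)
{
    uint32_t expire = 1000u, next_credit = 1100u;   /* made-up values */
    uint32_t far = next_credit + 0x80000000u + 5u;  /* gap > MAX_ULONG/2 */

    /* Inside the window: both checks withhold credit. */
    assert(replenish_old(1050u, next_credit) == 0);
    assert(replenish_new(1050u, expire, next_credit) == 0);

    /* Long idle gap: the old check fails, the patched one replenishes. */
    assert(replenish_old(far, next_credit) == 0);
    assert(replenish_new(far, expire, next_credit) == 1);

    /* No transmit limit (expire == next_credit): the range is empty. */
    assert(replenish_new(1050u, expire, expire) == 1);
}
```

A value of now that has wrapped a full 2^32 jiffies is indistinguishable from one inside the window (the first case above), which is the remaining case raised in the review.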

I think the credit timeout should be always armed to expire in
MAX_ULONG/4 jiffies (or some other large value).  If credit is exceeded,
this timer is then adjusted to fire earlier (at next_credit as it does

Setting a timer may fix the issue, but I don't see how to verify that fix except by waiting 180 days. I verified the above patch only by changing the value of expire to emulate the scenario.
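For reference, the real-time length of such a wait depends on the kernel's tick rate. The sketch below assumes 32-bit jiffies, so MAX_ULONG/2 is about 2^31 ticks; the function name days_until_half_wrap() and the HZ values in the comment are illustrative assumptions, not taken from the setup discussed here.

```c
#include <assert.h>

/*
 * Days of idle time needed before the delta between two jiffies values
 * exceeds 2^31 ticks, for a given tick rate hz (ticks per second).
 * Assumes 32-bit jiffies; hz must be non-zero.
 */
double days_until_half_wrap(unsigned int hz)
{
    const double half_range_ticks = 2147483648.0; /* 2^31 */
    const double seconds_per_day = 86400.0;

    return half_range_ticks / (double)hz / seconds_per_day;
}

/* Example tick rates: 100, 250 and 1000 are common CONFIG_HZ choices. */
void check_half_wrap_days(void)
{
    assert(days_until_half_wrap(100u) > 248.0 && days_until_half_wrap(100u) < 249.0);
    assert(days_until_half_wrap(250u) > 99.0 && days_until_half_wrap(250u) < 100.0);
    assert(days_until_half_wrap(1000u) > 24.0 && days_until_half_wrap(1000u) < 25.0);
}
```

Either way the wait is far too long for a practical test, which is why emulating the gap by adjusting expire is the more workable way to exercise the code path.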


