[Xen-devel] DomU's network interface will hang when Dom0 is running 32-bit

Hi Ian,
I recently ran into a hang of a DomU's network interface and have been working on the issue since then. I find that a DomU network interface that sends little traffic will hang if Dom0 is running a 32-bit kernel and the DomU's uptime is very long. I think there is a jiffies overflow bug in the function tx_credit_exceeded(). I know the time_after_eq(a,b) macro handles jiffies overflow, but it has one limit: a must not be larger than (b + MAX_SIGNED_LONG). If a is larger than that value, time_after_eq will return false. MAX_SIGNED_LONG (LONG_MAX) is 0x7fffffff on a 32-bit machine. If the DomU's network interface sends little traffic (<0.5k/s if HZ=250 and credit_bytes=ULONG_MAX), jiffies will go beyond (credit_timeout.expires + MAX_SIGNED_LONG) and time_after_eq(now, next_credit) will return false (it should be true). So a timer is armed that will not fire for a long time, and later processing is aborted for as long as timer_pending(&vif->credit_timeout) is true. The result is that the DomU's network interface hangs for a long time (> 40 days).
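To make the limit concrete, here is a small user-space sketch of the comparison. It is not the kernel code: time_after_eq32() and check_wrap_window() are hypothetical names, and the sketch assumes unsigned long is 32 bits wide, as on a 32-bit Dom0, and a two's-complement conversion to int32_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for time_after_eq() on a 32-bit kernel:
 * the unsigned difference a - b is reinterpreted as a signed 32-bit value.
 */
static inline bool time_after_eq32(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) >= 0;
}

/* Walks through the window in which the comparison gives the right answer. */
static inline void check_wrap_window(void)
{
	uint32_t expires = 0xffffffffu;

	/* Shortly after the counter wraps: handled correctly. */
	assert(time_after_eq32(0x00000005u, expires));

	/* Last value inside the window, expires + 0x7fffffff. */
	assert(time_after_eq32(0x7ffffffeu, expires));

	/* One jiffy later the signed difference turns negative. */
	assert(!time_after_eq32(0x7fffffffu, expires));

	/* A value far past the window: "now" looks like it is before expires. */
	assert(!time_after_eq32(0x800000ffu, expires));
}
```

Calling check_wrap_window() from a main() passes all four assertions, which matches the limit described above: the comparison is only reliable while now is within 0x7fffffff jiffies after the stored expiry.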
  Please consider the following scenario:
    Dom0 is running a 32-bit kernel with HZ = 1000.
vif->credit_timeout.expires = 0xffffffff, vif->remaining_credit = 0xffffffff, vif->credit_usec = 0, and jiffies = 0. The vif receives little traffic (the DomU sends few packets). If the rate is less than 2K/s, consuming 4G (0xffffffff) of credit takes at least 582.55 hours, and jiffies can grow larger than 0x7fffffff in that time. Suppose jiffies = 0x800000ff: time_after_eq(0x800000ff, 0xffffffff) returns false, and a timer whose expiry is 0xffffffff is left pending in the system. So the interface hangs until jiffies counts up to 0xffffffff again (which takes a very long time).
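The arithmetic of this scenario can be checked with a short user-space sketch. The helper names (seconds_to_consume, jiffies_until, check_scenario) and the SCENARIO_HZ constant are made up for illustration, and 2K/s is taken to mean 2048 bytes per second.

```c
#include <assert.h>
#include <stdint.h>

#define SCENARIO_HZ 1000u	/* HZ value taken from the scenario above */

/* Whole seconds needed to use up credit_bytes at bytes_per_sec. */
static inline uint64_t seconds_to_consume(uint64_t credit_bytes,
					  uint64_t bytes_per_sec)
{
	return credit_bytes / bytes_per_sec;
}

/* Jiffies a 32-bit counter at `now` still has to count to reach `expires`. */
static inline uint32_t jiffies_until(uint32_t now, uint32_t expires)
{
	return (uint32_t)(expires - now);
}

static inline void check_scenario(void)
{
	/* 0xffffffff bytes at 2048 bytes/s: 2097151 s, about 582.5 hours. */
	assert(seconds_to_consume(0xffffffffu, 2048) == 2097151);

	/* Timer armed for 0xffffffff while jiffies is 0x800000ff. */
	uint32_t wait = jiffies_until(0x800000ffu, 0xffffffffu);
	assert(wait == 0x7fffff00u);

	/* At HZ = 1000 that is 2147483 s, nearly 25 days of waiting. */
	assert(wait / SCENARIO_HZ == 2147483);
}
```

Under these assumptions the pending timer would not fire for nearly 25 days after the point where jiffies = 0x800000ff.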

  If there is any error in the explanation above, please help me by pointing it out.

