Re: [Xen-devel] [BUG] VIF rate limiting locks up network in the whole system



On 05/09/14 12:32, Ian Campbell wrote:
> On Fri, 2014-05-09 at 12:25 +0200, Jacek Konieczny wrote:
> 
>>> Do they perhaps differ between the working and non-working case
>>> (despite the input configuration being the same)?
>>
>> I will check that. I think this can be safely done even on a production
>> server, still running the old Xen and kernel.
> 
> Just to be clear I meant working with rate= (on the old setup) and not
> working without rate= (on the new setup). Working without rate= won't
> tell us much, since those keys simply won't be present..

Yes, I understood that.

I used the same xl configuration file on two hosts.

Xen 4.4.0, Linux 3.13.6 (not-working setup):

/local/domain/0/backend/vif/24/0/frontend = "/local/domain/24/device/vif/0"   (n0,r24)
/local/domain/0/backend/vif/24/0/frontend-id = "24"   (n0,r24)
/local/domain/0/backend/vif/24/0/online = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/state = "4"   (n0,r24)
/local/domain/0/backend/vif/24/0/script = "/etc/xen/scripts/vif-bridge"   (n0,r24)
/local/domain/0/backend/vif/24/0/mac = "02:00:0f:ff:00:1f"   (n0,r24)
/local/domain/0/backend/vif/24/0/rate = "800,50000"   (n0,r24)
/local/domain/0/backend/vif/24/0/bridge = "xenbr0"   (n0,r24)
/local/domain/0/backend/vif/24/0/handle = "0"   (n0,r24)
/local/domain/0/backend/vif/24/0/type = "vif"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-sg = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-gso-tcpv4 = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-gso-tcpv6 = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-ipv6-csum-offload = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-rx-copy = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-rx-flip = "0"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-split-event-channels = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/hotplug-status = "connected"   (n0,r24)

Xen 4.2.1, kernel 3.7.1 (old working setup; I don't have 4.3 and a newer
kernel handy):

/local/domain/0/backend/vif/20/0/frontend = "/local/domain/20/device/vif/0"   (n0,r20)
/local/domain/0/backend/vif/20/0/frontend-id = "20"   (n0,r20)
/local/domain/0/backend/vif/20/0/online = "1"   (n0,r20)
/local/domain/0/backend/vif/20/0/state = "4"   (n0,r20)
/local/domain/0/backend/vif/20/0/script = "/etc/xen/scripts/vif-bridge"   (n0,r20)
/local/domain/0/backend/vif/20/0/mac = "02:00:0d:ff:00:1f"   (n0,r20)
/local/domain/0/backend/vif/20/0/rate = "800,50000"   (n0,r20)
/local/domain/0/backend/vif/20/0/bridge = "br1"   (n0,r20)
/local/domain/0/backend/vif/20/0/handle = "0"   (n0,r20)
/local/domain/0/backend/vif/20/0/type = "vif"   (n0,r20)
/local/domain/0/backend/vif/20/0/feature-sg = "1"   (n0,r20)
/local/domain/0/backend/vif/20/0/feature-gso-tcpv4 = "1"   (n0,r20)
/local/domain/0/backend/vif/20/0/feature-rx-copy = "1"   (n0,r20)
/local/domain/0/backend/vif/20/0/feature-rx-flip = "0"   (n0,r20)
/local/domain/0/backend/vif/20/0/hotplug-status = "connected"   (n0,r20)

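The 'rate' value is the same on both hosts, but the advertised features
differ: the new setup additionally has feature-gso-tcpv6,
feature-ipv6-csum-offload and feature-split-event-channels.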
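
The comparison of the two dumps can also be done mechanically. The following
is only a minimal sketch, assuming the input looks like the listings quoted in
this mail (one `path = "value"   (perms)` entry per line, with wrapped lines
already joined). The names parse_dump and diff_dumps are hypothetical helpers,
not part of any Xen tool, and the sample data is an abbreviated copy of the
keys shown above.

```python
import re

# One entry of the dump: /some/path = "value"   (perms)
LINE_RE = re.compile(r'^(?P<path>/\S+)\s*=\s*"(?P<value>[^"]*)"')


def parse_dump(text):
    """Return {key: value}, keyed by the last path component only,
    so that dumps from different domain IDs can be compared."""
    entries = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            key = m.group("path").rsplit("/", 1)[-1]
            entries[key] = m.group("value")
    return entries


def diff_dumps(old, new):
    """Return (keys only in old, keys only in new, keys with changed value)."""
    only_old = sorted(set(old) - set(new))
    only_new = sorted(set(new) - set(old))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return only_old, only_new, changed


# Abbreviated sample data taken from the two listings in this mail.
OLD = '''
/local/domain/0/backend/vif/20/0/rate = "800,50000"   (n0,r20)
/local/domain/0/backend/vif/20/0/bridge = "br1"   (n0,r20)
/local/domain/0/backend/vif/20/0/feature-sg = "1"   (n0,r20)
/local/domain/0/backend/vif/20/0/feature-gso-tcpv4 = "1"   (n0,r20)
'''
NEW = '''
/local/domain/0/backend/vif/24/0/rate = "800,50000"   (n0,r24)
/local/domain/0/backend/vif/24/0/bridge = "xenbr0"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-sg = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-gso-tcpv4 = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-gso-tcpv6 = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-ipv6-csum-offload = "1"   (n0,r24)
/local/domain/0/backend/vif/24/0/feature-split-event-channels = "1"   (n0,r24)
'''

only_old, only_new, changed = diff_dumps(parse_dump(OLD), parse_dump(NEW))
print("only in old:", only_old)
print("only in new:", only_new)
print("changed:", changed)
```

Keying on the last path component is a deliberate simplification: it ignores
the domain ID in the path, which necessarily differs between the two hosts.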

>>> Those keys then affect netback's behaviour which is why I am interested
>>> in whether the kernel version has changed.
>>
> I think having confirmed that the xenstore keys are unchanged then the
> kernel side should be the focus.

I have booted the Xen 4.4.0 host with an older kernel: 3.7.10

The system does not lock up any more.

Xenstore variables for the frontend:

/local/domain/1/device/vif/0/backend = "/local/domain/0/backend/vif/1/0"   (n1,r0)
/local/domain/1/device/vif/0/backend-id = "0"   (n1,r0)
/local/domain/1/device/vif/0/state = "4"   (n1,r0)
/local/domain/1/device/vif/0/handle = "0"   (n1,r0)
/local/domain/1/device/vif/0/mac = "02:00:0f:ff:00:1f"   (n1,r0)
/local/domain/1/device/vif/0/tx-ring-ref = "9"   (n1,r0)
/local/domain/1/device/vif/0/rx-ring-ref = "768"   (n1,r0)
/local/domain/1/device/vif/0/event-channel = "11"   (n1,r0)
/local/domain/1/device/vif/0/request-rx-copy = "1"   (n1,r0)
/local/domain/1/device/vif/0/feature-rx-notify = "1"   (n1,r0)
/local/domain/1/device/vif/0/feature-sg = "1"   (n1,r0)
/local/domain/1/device/vif/0/feature-gso-tcpv4 = "1"   (n1,r0)
/local/domain/1/device/vif/0/feature-gso-tcpv6 = "1"   (n1,r0)
/local/domain/1/device/vif/0/feature-ipv6-csum-offload = "1"   (n1,r0)

Is it possible that one of the features introduced by the 3.13 kernel is
faulty (e.g. 'feature-split-event-channels')?

Is there a way to selectively enable/disable those features without changing
the kernel?

I will also try the 3.14.3 kernel, but I need to prepare it first.

Greets,
        Jacek
