To:	Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
Subject:	Re: [Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with broken frontend/backend ring I/O
From:	Daniel Stodden <daniel.stodden@xxxxxxxxxx>
Date:	Mon, 20 Jun 2011 13:47:27 -0700
Cc:	Xen <xen-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to:	<19967.31281.194700.653274@xxxxxxxxxxxxxxxxxxxxxxxx>
References:	<patchbomb.1308558389@xxxxxxxxxxxxxxxxxxxxxxx> <f73f9c3d6eaeac7a77c9.1308558390@xxxxxxxxxxxxxxxxxxxxxxx> <19967.31281.194700.653274@xxxxxxxxxxxxxxxxxxxxxxxx>

On Mon, 2011-06-20 at 12:49 -0400, Ian Jackson wrote:
> Daniel Stodden writes ("[Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with 
> broken frontend/backend ring I/O"):
> > Adds tool support to debug backends which expose I/O ring state in
> > sysfs. Currently supports /sys/devices/xen-backend/vbd-*-*/io_ring
> > nodes for block I/O, where implemented.
> 
> Thanks.
> 
> > Primary function is to observe ring state make progress over a period
> > of time, then report stuck message queue halves where pending
> > consumer/event are not moving.
> 
> This seems to have only one entry in COMMANDS, "check".  Is that
> right ?  

The <command> argument is there so that other ways of running it can be
added later without breaking existing deployments. I originally had a
'daemon' mode in mind, but then found that cron would likely do the job.

> And it doesn't seem to provide a way to specify a particular
> domain to look for ?

I briefly considered it initially, but after testing it just didn't look
so important anymore. :}

Presently, a 

# xen-ringwatch check -v 
RingWatch(vbd-1-51760/io_ring)[IDLE]: RingState(size=32, Req(prod=31, cons=31, event=32), Rsp(prod=31, pvt=31, event=32)): io: complete, req: complete, rsp: complete
RingWatch(vbd-1-51712/io_ring)[BUSY]: RingState(size=32, Req(prod=143236466, cons=143236466, event=143236467), Rsp(prod=143236459, pvt=143236459, event=143236460)): io: pending, req: complete, rsp: complete

will dump the entire set of running backends, independent of state.
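
For anyone who wants to post-process that output rather than read it by
eye, here is a minimal parsing sketch. The regular expression is inferred
purely from the two sample lines above, not from the tool's source, and
parse_line() is a hypothetical helper name. It collapses whitespace first,
so lines that a mailer has re-wrapped still match.

```python
import re

# Pattern inferred from the sample output above (an assumption, not the tool's own format spec).
LINE_RE = re.compile(
    r"RingWatch\((?P<name>[^)]+)\)\[(?P<status>\w+)\]: "
    r"RingState\(size=(?P<size>\d+), "
    r"Req\(prod=(?P<req_prod>\d+), cons=(?P<req_cons>\d+), event=(?P<req_event>\d+)\), "
    r"Rsp\(prod=(?P<rsp_prod>\d+), pvt=(?P<rsp_pvt>\d+), event=(?P<rsp_event>\d+)\)\)"
)


def parse_line(line):
    """Return a dict of fields for one output line, or None if it does not match."""
    # Collapse runs of whitespace (including newlines) into single spaces.
    match = LINE_RE.search(" ".join(line.split()))
    if match is None:
        return None
    fields = match.groupdict()
    # Everything except the ring name and the status tag is a counter.
    return {k: (v if k in ("name", "status") else int(v)) for k, v in fields.items()}
```

Filtering for one backend is then a matter of comparing the "name" field,
which is roughly what a grep on the raw output would do.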

I should point out there's no significant overhead involved, apart from
the wait period required to come to a conclusion. It's all
glob/read/write/wait, and all VBDs are watched in parallel. But even with
50 VMs, I anticipated that people would rather grep the output instead.
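
The read/wait/read pattern described above could be sketched roughly like
this. It is only an illustration: read_state is a hypothetical callable
supplied by the caller (for instance something that reads one io_ring
node), the default wait is arbitrary, and none of this is the tool's
actual code.

```python
import time


def unchanged_rings(read_state, names, wait=1.0, sleep=time.sleep):
    """Snapshot every ring, wait once, snapshot again.

    Returns the sorted names whose state did not change across the wait.
    """
    before = {name: read_state(name) for name in names}
    # A single wait covers all rings, so they are effectively observed in parallel.
    sleep(wait)
    after = {name: read_state(name) for name in names}
    return sorted(name for name in names if before[name] == after[name])
```

Note that an unchanged ring is not necessarily a stuck one. An idle ring
does not move either, so the result still has to be combined with a check
for pending work.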

Here's a sample crontab invocation:

xen-ringwatch check -T 4 --kick | logger -p daemon.crit -t RINGWATCH-ALERT

This remains silent until it actually discovers some subset of the
watched rings to .kick(), and then outputs those exclusively.

Jun 20 13:26:59 localhost RINGWATCH-ALERT: RingWatch(vbd-1-51712/io_ring)[STCK]: RingState(size=32, Req(prod=146141561, cons=146141561, event=146141562), Rsp(prod=146141561, pvt=146141561, event=146141530)): io: complete, req: complete, rsp: pending
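
As a reading aid for the io/req/rsp flags, here is a heuristic that
happens to reproduce the three sample lines in this mail. It is inferred
from those samples only. The comparisons are my assumption, not taken from
the tool, pending_flags() is a hypothetical name, and counter wraparound
is ignored.

```python
def pending_flags(req_prod, req_cons, rsp_prod, rsp_pvt, rsp_event):
    """Guess the io/req/rsp 'pending' flags from the ring counters (heuristic)."""
    return {
        # Requests consumed by the backend but not yet answered.
        "io": req_cons != rsp_pvt,
        # Requests produced by the frontend but not yet consumed by the backend.
        "req": req_prod != req_cons,
        # Responses produced while the response event pointer is not ahead of them.
        "rsp": rsp_event <= rsp_prod,
    }
```

For the STCK line above this yields rsp pending only, since the response
event (146141530) trails the response producer (146141561).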

> I'm happy to take it as-is as it seems like a better-than-nothing tool
> but I just wanted to check I'd understood it, first.

Found that the patch I sent was missing cleanup in some spots (mainly a
program rename, and the verbose variable in __main__ ended up off by
one). Can I sneak in the update attached before you push it?

Also, I never tried the make install target. Does it look okay to you?

Cheers,
Daniel

xen-ringwatch.diff
Description: Text Data
