xen-devel
Re: [Xen-devel] OOM problems
Performance is noticeably lower with aio on these bursty write
workloads; I've been getting a number of complaints.
I see that 2.6.36 has some page_writeback changes:
http://www.kernel.org/diff/diffview.cgi?file=%2Fpub%2Flinux%2Fkernel%2Fv2.6%2Fpatch-2.6.36.bz2;z=8379
Any thoughts on whether these would make a difference for the problems
with "file:"? I'm still trying to find a way to reproduce the issue in
the lab, so I'd have to test the patch in production -- that's not a
tantalizing prospect unless there is a real chance that it will help.
-John
On 11/15/2010 9:59 AM, John Weekes wrote:
They are throttled, but the single control I'm aware of
is /proc/sys/vm/dirty_ratio (or dirty_bytes, nowadays). Which is only
per process, not a global limit. Could well be that's part of the
problem -- outwitting mm with just too many writers on too many cores?
We had a bit of trouble when switching dom0 to 2.6.32, buffered writes
made it much easier than with e.g. 2.6.27 to drive everybody else into
costly reclaims.
The OOM shown here reports ~650M in dirty pages. The fact alone
that this counts as an OOM condition doesn't sound quite right in
itself. That qemu might just have dared to ask at the wrong point in
time.
Just to get an idea -- how many guests did this box carry?
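
For anyone wanting to watch the numbers discussed above on their own dom0, here is a minimal Python sketch that reports the dirty and writeback totals from /proc/meminfo and reads the vm.dirty_* knobs. The helper names (parse_meminfo, dirty_summary, read_vm_knob) are made up for illustration and are not part of any Xen or kernel tooling; the only assumption about the system is the usual "Name:   value kB" layout of /proc/meminfo.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts. Lines that do not look like "Name: number" are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_summary(meminfo_text):
    """Return dirty, writeback and total memory in MB from meminfo text."""
    info = parse_meminfo(meminfo_text)
    return {
        "dirty_mb": info.get("Dirty", 0) / 1024,
        "writeback_mb": info.get("Writeback", 0) / 1024,
        "memtotal_mb": info.get("MemTotal", 0) / 1024,
    }


def read_vm_knob(name, base="/proc/sys/vm"):
    """Read an integer sysctl such as dirty_ratio; None if unavailable."""
    try:
        with open(os.path.join(base, name)) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

On a dom0 this could be used as dirty_summary(open("/proc/meminfo").read()) together with read_vm_knob("dirty_ratio") and read_vm_knob("dirty_bytes"), sampled periodically while the guests do their bursty writes.
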
It carries about two dozen guests, with a mix of mostly HVMs (all
stubdom-based, some with PV-on-HVM drivers) and some PV.
This problem occurred more often for me under 2.6.32 than 2.6.31, I
noticed. Since I made the switch to aio, I haven't seen a crash, but
it hasn't been long enough for that to mean much.
Having extra caching in the dom0 is nice because it allows for domUs
to get away with having small amounts of free memory, while still
having very good (much faster than hardware) write performance. If you
have a large number of domUs that are all memory-constrained and use
the disk in infrequent, large bursts, this can work out pretty well,
since the big communal pool provides a better value proposition than
giving each domU a few more megabytes of RAM.
If the OOM problem isn't something that can be fixed, it might be a
good idea to print out a warning to the user when a domain using
"file:" is started. Or, to go a step further and automatically run
"file" based domains as though "aio" was specified, possibly with a
warning and a way to override that behavior. It's not really intuitive
that "file" would cause crashes.
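
To make the suggestion concrete, a rough Python sketch of such a check follows. It is purely illustrative and is not how the toolstack handles disk specifications; the function names are hypothetical, and it assumes xm-style disk strings such as "file:/path/disk.img,xvda,w" with "tap:aio:" as the aio equivalent.

```python
def is_file_disk(spec):
    """True if an xm-style disk spec uses the loopback "file:" prefix."""
    return spec.strip().lower().startswith("file:")


def file_disk_warnings(disks):
    """Return one warning string per disk spec that uses "file:"."""
    warnings = []
    for spec in disks:
        if is_file_disk(spec):
            target = spec.strip().split(",", 1)[0]
            warnings.append(
                "warning: %s is buffered through the dom0 page cache; "
                "consider tap:aio: instead" % target
            )
    return warnings


def to_tap_aio(spec):
    """Rewrite a "file:" spec as "tap:aio:"; other specs are returned as-is."""
    stripped = spec.strip()
    if is_file_disk(stripped):
        return "tap:aio:" + stripped[len("file:"):]
    return spec
```

For example, to_tap_aio("file:/var/images/guest.img,xvda,w") gives "tap:aio:/var/images/guest.img,xvda,w", while a "phy:" spec is left alone. An override, as suggested above, could simply skip the rewrite and keep the warning.
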
-John
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel