xen-devel
Re: [Xen-devel] OOM problems
Daniel:
> Which branch/revision does latest pvops mean?
stable-2.6.32, using the latest pull as of today. (I also tried
next-2.6.37, but it wouldn't boot for me.)
> Would you be willing to try and reproduce that again with the XCP blktap
> (userspace, not kernel) sources? Just to further isolate the problem.
> Those see a lot of testing. I certainly can't think of a single fix
> to the aio layer in ages. But I'm never sure about other stuff
> potentially broken in userland.
I'll have to give it a try. Normal blktap still isn't working with
pv_ops, though, so I hope this is a drop-in for blktap2.
In my last bit of troubleshooting, I took O_DIRECT out of the open call
in tools/blktap2/drivers/block-aio.c, and preliminary testing indicates
that this might have eliminated the problem with corruption. I'm testing
further now, but could there be an issue with alignment (since the
kernel is apparently very strict about it with direct I/O)? (Removing
this flag also brings back in use of the page cache, of course.)
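To make the alignment question concrete, here is a minimal sketch of the checks a direct-I/O request generally has to pass. It is not taken from block-aio.c; the helper names dio_request_is_aligned() and dio_alloc() are hypothetical, and the 512-byte SECTOR_SIZE is an assumption (the real requirement is the logical block size of the underlying device, which should be queried rather than hard-coded).

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Assumed logical block size. Real code should ask the device
 * (for example with the BLKSSZGET ioctl) instead of hard-coding it. */
#define SECTOR_SIZE 512u

/* Hypothetical helper: returns 1 if the buffer address, file offset and
 * transfer length are all multiples of SECTOR_SIZE, 0 otherwise. */
static int dio_request_is_aligned(const void *buf, off_t offset, size_t len)
{
    return ((uintptr_t)buf % SECTOR_SIZE == 0) &&
           ((uint64_t)offset % SECTOR_SIZE == 0) &&
           (len % SECTOR_SIZE == 0);
}

/* Hypothetical helper: allocates a buffer whose address and size are both
 * multiples of SECTOR_SIZE. Returns NULL on failure; release with free(). */
static void *dio_alloc(size_t len)
{
    void *buf = NULL;
    size_t rounded = (len + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;

    if (posix_memalign(&buf, SECTOR_SIZE, rounded) != 0)
        return NULL;
    return buf;
}
```

A request built from a plain malloc() buffer, or one that starts at an arbitrary byte offset, can fail the check above; whether that is what happens in the tapdisk path is exactly the open question here.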
> If dio is definitely not what you feel you need, let's get back to your
> original OOM problem. Did reducing dom0 vcpus help? 24 of them is quite
> aggressive, to say the least.
When I switched to aio, I reduced the vcpus to 2 (I needed to do this
with dom0_max_vcpus, rather than through xend-config.sxp -- the latter
wouldn't always boot). I haven't separately tried cached I/O with
reduced CPUs yet, except in the lab; and unfortunately I still can't get
the problem to happen in the lab, no matter what I try.
> If that alone doesn't help, I'd definitely try and check vm.dirty_ratio.
> There must be a tradeoff which doesn't imply scribbling the better half
> of 1.5GB main memory.
The default for dirty_ratio is 20. I tried halving that to 10, but it
didn't help. I could try lower, but I like the thought of keeping this
in user space, if possible, so I've been pursuing the blktap2 path most
aggressively.
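For a rough sense of scale, the arithmetic behind those two settings can be sketched as follows. This is only an approximation: it assumes the ratio is applied to the full 1.5GB, whereas the kernel applies it to the memory it considers dirtyable, so the real thresholds are somewhat lower. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Approximate upper bound on dirty page cache, in bytes, for a given
 * amount of memory and a vm.dirty_ratio percentage. Assumption: the
 * ratio is applied to all of mem_bytes. */
static uint64_t dirty_limit_bytes(uint64_t mem_bytes, unsigned ratio_percent)
{
    return mem_bytes * ratio_percent / 100;
}

/* Read the current vm.dirty_ratio from procfs.
 * Returns the percentage, or -1 if it cannot be read. */
static int read_dirty_ratio(void)
{
    FILE *f = fopen("/proc/sys/vm/dirty_ratio", "r");
    int ratio = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &ratio) != 1)
        ratio = -1;
    fclose(f);
    return ratio;
}
```

Under that assumption, 1.5GB (1536 MiB) at a ratio of 20 allows roughly 307 MiB of dirty pages, and a ratio of 10 roughly 154 MiB, so even the halved value still leaves a sizable share of a small dom0 to writeback.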
Ian:
> That's disturbing. It might be worth trying to drop the number of VCPUs in
> dom0 to 1 and then try to repro.
> BTW: for production use I'd currently be strongly inclined to use the XCP
> 2.6.32 kernel.
Interesting, ok.
-John
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel