This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] Re: blktap: Sync with XCP, dropping zero-copy.

On 11/17/2010 12:21 PM, Daniel Stodden wrote:
> And, like all granted frames, not owning them implies they are not
> resolvable via mfn_to_pfn, thereby failing in follow_page, thereby gup()
> without the VM_FOREIGN hack.

Hm, I see.  Well, I wonder if using _PAGE_SPECIAL would help (it is put
on usermode ptes which don't have a backing struct page).  After all,
there's no fundamental reason why it would need a pfn; the mfn in the
pte is what's actually needed to ultimately generate a DMA descriptor.
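The idea above can be sketched roughly as follows. This is a hypothetical illustration, not the actual blktap code: it assumes a VM_PFNMAP-style mapping and the `vm_insert_pfn()` interface of that era, and the function and variable names are invented for the example.

```c
/*
 * Hypothetical sketch: mapping a granted frame into a user VMA
 * without a backing struct page.  With VM_PFNMAP, vm_insert_pfn()
 * installs a pte that follow_page()/get_user_pages() will not try
 * to resolve to a struct page; on architectures with _PAGE_SPECIAL
 * the pte is built with pte_mkspecial(), making the "no struct page
 * here" property explicit instead of relying on VM_FOREIGN.
 */
#include <linux/mm.h>

static int map_granted_frame(struct vm_area_struct *vma,
                             unsigned long uaddr, unsigned long mfn)
{
	/* Tell the VM this range may have no struct page behind it. */
	vma->vm_flags |= VM_PFNMAP;

	/* Install the frame directly by frame number. */
	return vm_insert_pfn(vma, uaddr, mfn);
}
```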

> Correct me if I'm mistaken. I used to be quicker looking up stuff on
> arch-xen kernels, but I think fundamental constants of the Xen universe
> didn't change since last time.

No, but Linux has.

> [
> Part of the reason why blktap *never* frees those pages, apart from
> being slightly greedy, are deadlock hazards when writing those nodes in
> dom0 through the pagecache, as dom0 might. You need memory pools on the
> datapath to guarantee progress under pressure. That got pretty ugly
> after 2.6.27, btw.
> ]

That's what mempools are intended to solve.
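For reference, the mempool pattern looks roughly like this. The names (`blktap_pool`, `NR_RESERVED_PAGES`) are illustrative, not from the driver; the point is that a fixed reserve of order-0 pages is set aside up front so the datapath can always make progress under memory pressure.

```c
/*
 * Sketch of the mempool pattern: pre-reserve order-0 pages so that
 * allocations on the writeout path cannot deadlock waiting for
 * memory that only writeout itself could free.
 */
#include <linux/mempool.h>
#include <linux/gfp.h>

#define NR_RESERVED_PAGES 64	/* illustrative reserve size */

static mempool_t *blktap_pool;

static int __init blktap_pool_init(void)
{
	/*
	 * mempool_alloc_pages()/mempool_free_pages() are the stock
	 * page-allocator helpers from <linux/mempool.h>; the final
	 * argument is the allocation order (0 here).
	 */
	blktap_pool = mempool_create(NR_RESERVED_PAGES,
				     mempool_alloc_pages,
				     mempool_free_pages,
				     (void *)0);
	return blktap_pool ? 0 : -ENOMEM;
}
```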

> In any case, let's skip trying what happens if a thundering herd of
> several hundred userspace disks tries gfp()ing their grant slots out of
> dom0 without arbitration.

I'm not against arbitration, but I don't think that's something that
should be implemented as part of a Xen driver.

>>> I guess we've been meaning the same thing here, unless I'm
>>> misunderstanding you. Any pfn does, and the balloon pagevec allocations
>>> default to order 0 entries indeed. Sorry, you're right, that's not a
>>> 'range'. With a pending re-xmit, the backend can find a couple (or all)
>>> of the request frames have count>1. It can flip and abandon those as
>>> normal memory. But it will need those lost memory slots back, straight
>>> away or next time it's running out of frames. As order-0 allocations.
>> Right.  GFP_KERNEL order 0 allocations are pretty reliable; they only
>> fail if the system is under extreme memory pressure.  And it has the
>> nice property that if those allocations block or fail it rate limits IO
>> ingress from domains rather than being crushed by memory pressure at the
>> backend (ie, the problem with trying to allocate memory in the writeout
>> path).
>> Also the cgroup mechanism looks like an extremely powerful way to
>> control the allocations for a process or group of processes to stop them
>> from dominating the whole machine.
> Ah. In case it can be put to work to bind processes allocating pagecache
> entries for dirtying to some boundary, I'd be really interested. I think
> I came across it once but didn't take the time to read the docs
> thoroughly. Can it?

I'm not sure about dirtiness - it seems like something that should be
within its remit, even if it doesn't currently have it.

The cgroup mechanism is extremely powerful, now that I look at it.  You
can do everything from setting block IO priorities and QoS parameters to
CPU limits.


Xen-devel mailing list