Re: Pinned, non-revocable mappings of VRAM: will bad things happen?
On 4/17/26 03:53, Christian König wrote:
> On 4/16/26 18:13, Demi Marie Obenour wrote:
>> On 4/16/26 05:57, Christian König wrote:
>>> On 4/16/26 01:27, Demi Marie Obenour wrote:
>>>> Is it safe to assume that if a dmabuf exporter cannot handle
>>>> non-revocable, pinned importers, it will fail the import?  Or is
>>>> using dma_buf_pin() unsafe if one does not know the exporter?
>>>
>>> Neither.
>>>
>>> dma_buf_pin() makes sure that the importer doesn't get any
>>> invalidation notifications because the exporter moves the backing
>>> store of the buffer around for memory management.
>>>
>>> But what is still possible is that the exporter is hot-removed, in
>>> which case the importer should terminate its DMA operation as soon
>>> as possible.
>>>
>>> GPU drivers usually reject pin requests to VRAM from DMA-buf
>>> importers when that isn't restricted by cgroups, for example,
>>> because it can otherwise easily result in a denial of service.
>>>
>>> Amdgpu only recently started to allow pinning into VRAM, to support
>>> RDMA without ODP (I think it was ODP, but it could be that I mixed
>>> up the RDMA three-letter code for that feature).
>>>
>>>> For context, Xen grant tables do not support revocation.  One can
>>>> ask the guest to unmap the grants, but if the guest doesn't obey,
>>>> the only recourse is to ungracefully kill it.  They also do not
>>>> support page faults, so the pages must be pinned.  Right now,
>>>> grant tables don't support PCI BAR mappings, but that's fixable.
>>>
>>> That sounds like a use case for the DMA-buf pin interface.
>>>
>>>> How badly is this going to break with dGPU VRAM, if at all?  I
>>>> know that AMDGPU has a fallback when the BAR isn't mappable.  What
>>>> about other drivers?  Supporting page faults the way KVM does is
>>>> going to be extremely hard, so pinned mappings and DMA transfers
>>>> are vastly preferable.
>>>
>>> Well, if you only want to share a fixed amount of VRAM, then that
>>> is pretty much OK.
>>>
>>> But when the client VM can trigger pinning on demand without any
>>> limitation, you can rather easily mount a denial of service against
>>> the host.  That is usually a rather bad idea.
>>
>> Is there a reasonable way to choose such an amount?
>
> Not really.
>
>> Unless I am mistaken, client workloads are highly non-uniform: a
>> single game or compute job might well use more VRAM than every other
>> program on the system combined.
>
> Yeah, perfectly correct.
>
>> Are these workloads impossible to make work well with pinning?
>
> No, as long as you know the workload beforehand, e.g. when you define
> the limit.
>
> I mean, that's why basically everybody avoids pinning and assigning
> fixed amounts of resources.
>
> Even if you can make it work technically, pinning usually results in
> a rather bad end-user experience.
>
> Regards,
> Christian.

Do drivers and programs assume that they can access VRAM from the CPU?

Are any of the following reasonable options?

1. Change the guest kernel to only map (and thus pin) a small subset of
   VRAM at any given time.  If unmapped VRAM is accessed, the guest
   traps the page fault, evicts an old VRAM mapping, and creates a new
   one.

2. Pretend that resizable BAR is not enabled, so the guest doesn't
   think it can map much of VRAM at once.  If resizable BAR is enabled
   on the host, it might be possible to split the large BAR mapping in
   a lot of ways.

Or does Xen really need to allow the host to handle guest page faults?
That would add a huge amount of complexity to trusted and
security-critical parts of the system, so it really is a last resort.
Putting the complexity into the guest virtio-GPU driver is vastly
preferable if it can be made to work well.

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
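Option 1 above (keeping only a small window of VRAM mapped, and thus pinned, at a time) is essentially an LRU-managed mapping cache driven by page faults. A minimal sketch of that policy, where `map_page`/`unmap_page` are hypothetical stand-ins for whatever grant-table map/unmap operations the guest driver would actually issue:

```python
from collections import OrderedDict

class VramWindow:
    """Sketch of option 1: keep at most `capacity` VRAM pages mapped
    (and therefore pinned) at once.  On access to an unmapped page,
    evict the least-recently-used mapping and map the new page.
    map_page/unmap_page are placeholders, not a real Xen API."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.mapped = OrderedDict()  # page -> True, in LRU order

    def access(self, page):
        if page in self.mapped:
            self.mapped.move_to_end(page)  # refresh LRU position
            return "hit"
        # Page fault path: make room, then map the faulting page.
        if len(self.mapped) >= self.capacity:
            victim, _ = self.mapped.popitem(last=False)  # evict LRU
            self.unmap_page(victim)
        self.map_page(page)
        self.mapped[page] = True
        return "fault"

    def map_page(self, page):
        pass  # placeholder: would map (and pin) the page via a grant

    def unmap_page(self, page):
        pass  # placeholder: would unmap (and unpin) the page
```

The interesting trade-off is the same one discussed above: the pinned footprint is bounded by `capacity` regardless of the workload, at the cost of fault-handling overhead when the working set exceeds the window.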