[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v1 00/16] dma-mapping: migrate to physical address-based API
On Fri, Aug 08, 2025 at 08:51:08PM +0200, Marek Szyprowski wrote: > First - basing the API on the phys_addr_t. > > Page based API had the advantage that it was really hard to abuse it and > call for something that is not 'a normal RAM'. This is not true anymore. Today we have ZONE_DEVICE as a struct page type with a whole bunch of non-dram sub-types: enum memory_type { /* 0 is reserved to catch uninitialized type fields */ MEMORY_DEVICE_PRIVATE = 1, MEMORY_DEVICE_COHERENT, MEMORY_DEVICE_FS_DAX, MEMORY_DEVICE_GENERIC, MEMORY_DEVICE_PCI_P2PDMA, }; Few of which are kmappable/page_to_virtable() in a way that is useful for the DMA API. DMA API sort of ignores all of this and relies on the caller to not pass in an incorrect struct page. eg we rely on things like the block stack to do the right stuff when a MEMORY_DEVICE_PCI_P2PDMA is present in a bio_vec. Which is not really fundamentally different from just using phys_addr_t in the first place. Sure, this was a stronger argument when this stuff was originally written, before ZONE_DEVICE was invented. > I initially though that phys_addr_t based API will somehow simplify > arch specific implementation, as some of them indeed rely on > phys_addr_t internally, but I missed other things pointed by > Robin. Do we have here any alternative? I think it is less of a code simplification, more as a reduction in conceptual load. When we can say directly there is no struct page type anyhwere in the DMA API layers then we only have to reason about kmap/phys_to_virt compatibly. This is also a weaker overall requirement than needing an actual struct page which allows optimizing other parts of the kernel. Like we aren't forced to create MEMORY_DEVICE_PCI_P2PDMA stuct pages just to use the dma api. Again, any place in the kernel we can get rid of struct page the smoother the road will be for the MM side struct page restructuring. For example one of the bigger eventual goes here is to make a bio_vec store phys_addr_t, not struct page pointers. DMA API is not alone here, we have been de-struct-paging the kernel for a long time now: netdev: https://lore.kernel.org/linux-mm/20250609043225.77229-1-byungchul@xxxxxx/ slab: https://lore.kernel.org/linux-mm/20211201181510.18784-1-vbabka@xxxxxxx/ iommmu: https://lore.kernel.org/all/0-v4-c8663abbb606+3f7-iommu_pages_jgg@xxxxxxxxxx/ page tables: https://lore.kernel.org/linux-mm/20230731170332.69404-1-vishal.moola@xxxxxxxxx/ zswap: https://lore.kernel.org/all/20241216150450.1228021-1-42.hyeyoo@xxxxxxxxx/ With a long term goal that struct page only exists for legacy code, and is maybe entirely compiled out of modern server kernels. > Second - making dma_map_phys() a single API to handle all cases. > > Do we really need such single function to handle all cases? If we accept the direction to remove struct page then it makes little sense to have a dma_map_ram(phys_addr) and dma_map_resource(phys_addr) and force key callers (like block) to have more ifs - especially if the conditional could become "free" inside the dma API (see below). Plus if we keep the callchain split then adding a "dma_link_resource"/etc are now needed as well. > DMA_ATTR_MMIO for every typical DMA user? I know that branching is > cheap, but this will probably increase code size for most of the typical > users for no reason. Well, having two call chains will increase the code size much more, and 'resource' can't be compiled out. Arguably this unification should reduce the .text size since many of the resource only functions go away. There are some branches, and I think the push toward re-using DMA_ATTR_SKIP_CPU_SYNC was directly to try to reduce that branch cost. However, I think we should be looking for a design here that is "free" on the fast no-swiotlb and non-cache-flush path. I think this can be achieved by checking ATTR_MMIO only after seeing swiotlb is needed (like today's is p2p check). And we can probably freely fold it into the existing sync check: if ((attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_MMIO)) == 0) I saw Leon hasn't done these micro optimizations, but it seems like it could work out. Regards, Jason
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |