[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [XEN PATCH] tools/libs/light/libxl_pci.c: explicitly grant access to Intel IGD opregion



Good day,

You're looking for this document i was focusing on a week ago, i believe. Anyway, here's the url to this doc below:


https://tutiendafit.com.mx/olea/sndaem

On 3/15/22 7:38 AM, Jan Beulich wrote: > On 14.03.2022 04:41, Chuck Zmudzinski wrote: >> Fixes: abfb006f1ff4 (tools/libxl: explicitly grant access to needed I/O-memory ranges) >> Fixes: 0561e1f01e87 (xen/common: do not implicitly permit access to mapped I/O memory) >> Backport: 4.12+ > Just fyi: This is fine to have as a tag, but it wouldn't be backported > farther than to 4.15. That's entirely reasonable. I understand this is a bug fix, not a security issue. > > Apart from this largely some style issues (see below), but please > realize that I'm not a libxl maintainer and hence I may not have good > enough knowledge of, in particular, potential unwritten conventions. I will take your comments into consideration regarding style before writing the next version of the patch, and carefully check libxl's coding style file. > >> @@ -610,6 +612,45 @@ out: >> return ret; >> } >> >> +static uint32_t sysfs_dev_get_igd_opregion(libxl__gc *gc, >> + libxl_device_pci *pci) >> +{ >> + char *pci_device_config_path = >> + GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/config", >> + pci->domain, pci->bus, pci->dev, pci->func); >> + size_t read_items; >> + uint32_t igd_opregion; >> + uint32_t error = 0xffffffff; > I think this constant wants to gain a #define, to be able to correlate > the use sites. I'm also not sure of the value - in principle the > register can hold this value, but of course then it won't be 3 pages. What we are storing as the return value is the starting address, not the size, of the opregion, and that is a 32-bit value. If we cannot read it, we return 0xffffffff instead to indicate the error that we were unable to read the starting address of the opregion from the config attribute in sysfs. The 32-bit value we are looking for is read at offset 0xfc from the start of the config attribute of the Intel IGD in sysfs. The offset 0xfc is defined by PCI_INTEL_OPREGION both here and in hvmloader (and also in Qemu). The data that is read at this offset from the start of the config attribute of the Intel IGD in sysfs is the 32-bit address of the start of the opregion. > Maybe the error check further down should be to see whether adding 2 > to the value would overflow in 32 bits? (In that case a #define may > not be needed anymore, as there wouldn't be multiple instances of the > constant in the code.) That would work also. Please not that I chose that value for an error value consistent with the way the current function sysfs_dev_get_vendor does it. While that function does not assign the variable 'error' to its return value for an error, which in that case is 0xffff because that function returns uint16_t instead of uint32_t, I chose to explicitly assign the error variable to that value to help make the code more readable. Your� comment that this could be a #define instead is good. I also think we should use a #define for the error return value of the sysfs_dev_get_vendor function Something like: #define ERROR_16��� 0xffff #define ERROR_32��� 0xffffffff might be appropriate. But that would be touching code unrelated to this bug fix. I think again the libxl maintainers should weigh in about what to do here. They might let me take this opportunity to update and improve the style of the patched file in other functions in the file not related to this bug fix but I am not inclined to do that without an explicit request from them to do so. So I am not sure yet what I will do in the next version of the patch, but I will address your concerns here and try to explain my reasoning for the changes in the changelog for version 2 of the patch. > >> + >> + FILE *f = fopen(pci_device_config_path, "r"); >> + if (!f) { > While libxl has some special style rules, I think it still wants a > blank line between declaration(s) and statement(s), just like we > expect elsewhere. Effectively you want to simply move the blank line > you have one line down. I think I followed the same style here as the existing sysfs_dev_get_xxx functions. I will double check that and use the same style the other functions use unless they clearly violate the rules, and note that I deviated from the style of the existing functions to conform to current coding style and suggest a subsequent patch to update the style of the other functions. > >> @@ -2531,6 +2572,37 @@ int libxl__grant_vga_iomem_permission(libxl__gc *gc, const uint32_t domid, >> domid, vga_iomem_start, (vga_iomem_start + 0x20 - 1)); >> return ret; >> } >> + >> + /* If this is an Intel IGD, allow access to the IGD opregion */ >> + if (!libxl__is_igd_vga_passthru(gc, d_config)) return 0; > Despite the provision for "return" or alike to go on the same line > as an error code check, I don't think this is okay here. It would be > if, as iirc generally expected in libxl, you latched the function > return value into a local variable named "rc" (I think). I will double check how the function being patched handles the return value. I don't even remember if it has a local variable named rc for a return value. IIRC it was either ret or 0. I understand that libxl expects rc to be used these days, though. This might be another candidate for updating the file to libxl's current standards. > >> + uint32_t igd_opregion = sysfs_dev_get_igd_opregion(gc, pci); >> + uint32_t error = 0xffffffff; > Please don't mix declarations and statements. I presume you are saying these two lines should be re-written as: uint32_t igd_opregion; unti32_t error; igd_opregion = sysfs_dev_get_igd_opregion(gc, pci); error = 0xffff; Please reply if my understanding here is not correct. > I also don't think > "error" is really necessary as a local variable, but with the change > suggested above it might disappear anyway. I do plan for the next version of the patch to use a #define for this instead of the error variable (or add 2 to overflow it), so it will disappear in the next version. > >> + if (igd_opregion == error) break; > Like above I'm not sure this is okay to all live on one line. I also > think it would be nice if you used "return 0" or "break" consistently. > Of course a related question is whether failure here should actually > be reported to the caller. Good points here. I agree about consistency with break and return 0. I will change this to return 0 and move it to the next line. I do not want to change the current meaning of the return value without knowledge of how the caller uses the return value. IIRC, currently the function always returns 0 unless it encounters a negative return value from xc_domain_iomem_permission, in which case it returns that negative value to indicate an error to the caller. So if we return anything other than 0 here, we might be returning an error code that the caller does not expect or interpret correctly. I will also consider putting an error message here before returning 0. A message something like "dom%d: Intel IGD detected, but could not find IGD opregion" would explain the error that happens here. I don't think a non-zero error code to the caller is appropriate here, though, because, as already mentioned, IIRC this might return a value the caller does not interpret correctly. If it is necessary to return an error to the caller here instead of 0, it will be necessary to ensure all callers of this function will interpret it correctly. I would suggest an error return value greater than 0 to distinguish it from the return value <> >> + vga_iomem_start = ( (uint64_t) igd_opregion ) >> XC_PAGE_SHIFT; > There's no need for a cast here, as you're right-shifting. Also > (just fyi) there would have been three to many spaces here. I'm > additionally not certain whether re-using a variable for a purpose > not matching its name is deemed acceptable by libxl maintainers. I wrote it that way expecting a compiler error if I didn't do the cast. I have not checked if the cast is necessary, though, and maybe you are right. I will check and see if it is necessary by removing the cast and see if the compiler complains. If the cast is not needed, I will just use the 32-bit igd_opregion variable when calling xc_domain_iomem_permission instead of the 64-bit vga_iomem_start variable. I will remove the three spaces and use a more descriptive variable instead of re-using vga_iomem_start if the compiler insists on the cast from 32-bit to 64-bit. > >> + ret = xc_domain_iomem_permission(CTX->xch, stubdom_domid, >> + vga_iomem_start, >> + IGD_OPREGION_PAGES, 1); >> + if (ret < 0) { >> + LOGED(ERROR, domid, >> + "failed to give stubdom%d access to iomem range " >> + "%"PRIx64"-%"PRIx64" for IGD passthru", >> + stubdom_domid, vga_iomem_start, (vga_iomem_start + >> + IGD_OPREGION_PAGES - 1)); >> + return ret; >> + } > I have to admit that I find it odd that this is done unconditionally, > but I notice the same is done in pre-existing code. I would have > expected this to happen only when there actually is a device model > stub domain. I don't understand how that works either. All my tests have been with the device model running as a process in dom0. I am thinking maybe in that case it just uses dom0 for the stub domain, but I have not checked that. I will check it by dumping the value of stubdom_domid to a log in my next test. Thank you for responding promptly. Now I have some work to do writing the next version of the patch and documenting it clearly in its changelog. It will take me a while - I will spend enough time on it so hopefully the libxl maintainers don't have to spend so much time on it. Chuck N.B. I forgot to send this reply to xen-devel and cc the libxl maintainers, so I am doing so here. I also re-formatted my replies to avoid lines with too many characters. Sorry for the confusion.

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.