[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] qemu-xen: Intel GPU passthrough, fix OpRegion mapping. (v3)



On 31 January 2012 14:51, Jean Guyader <jean.guyader@xxxxxxxxxxxxx> wrote:
> On 17/01 04:34, Stefano Stabellini wrote:
>> On Tue, 17 Jan 2012, Jean Guyader wrote:
>> > On 17/01 02:51, Stefano Stabellini wrote:
>> > > On Mon, 16 Jan 2012, Jean Guyader wrote:
>> > > > On 12 January 2012 14:34, Stefano Stabellini
>> > > > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>> > > > > On Thu, 12 Jan 2012, Jean Guyader wrote:
>> > > > >> On 12/01 12:41, Jean Guyader wrote:
>> > > > >> > The OpRegion shouldn't be mapped 1:1 because the address in the 
>> > > > >> > host
>> > > > >> > can't be used in the guest directly.
>> > > > >> >
>> > > > >> > This patch traps read and write access to the opregion of the 
>> > > > >> > Intel
>> > > > >> > GPU config space (offset 0xfc).
>> > > > >> >
>> > > > >> > To work correctly this patch needs a change in hvmloader.
>> > > > >> >
>> > > > >> > HVMloader will allocate 2 pages for the OpRegion and write this 
>> > > > >> > address
>> > > > >> > on the config space of the Intel GPU. Qemu will trap and map the 
>> > > > >> > host
>> > > > >> > OpRegion to the guest. Any write to this offset after that won't 
>> > > > >> > have
>> > > > >> > any effect. Any read of this config space offset will return the 
>> > > > >> > address
>> > > > >> > in the guest.
>> > > > >> >
>> > > > >> > Signed-off-by: Jean Guyader <jean.guyader@xxxxxxxxxxxxx>
>> > > > >> commit a6036bee23bb338e6cf48e9f0d75ff0845f8cfe3
>> > > > >> Author: Jean Guyader <jean.guyader@xxxxxxxxxxxxx>
>> > > > >> Date: ?? Wed Nov 23 07:53:30 2011 +0000
>> > > > >>
>> > > > >> ?? ?? qemu-xen: Intel GPU passthrough, fix OpRegion mapping.
>> > > > >>
>> > > > >> ?? ?? The OpRegion shouldn't be mapped 1:1 because the address in 
>> > > > >> the host
>> > > > >> ?? ?? can't be used in the guest directly.
>> > > > >>
>> > > > >> ?? ?? This patch traps read and write access to the opregion of the 
>> > > > >> Intel
>> > > > >> ?? ?? GPU config space (offset 0xfc).
>> > > > >>
>> > > > >> ?? ?? To work correctly this patch needs a change in hvmloader.
>> > > > >>
>> > > > >> ?? ?? HVMloader will allocate 2 pages for the OpRegion and write 
>> > > > >> this address
>> > > > >> ?? ?? on the config space of the Intel GPU. Qemu will trap and map 
>> > > > >> the host
>> > > > >> ?? ?? OpRegion to the guest. Any write to this offset after that 
>> > > > >> won't have
>> > > > >> ?? ?? any effect. Any read of this config space offset will return 
>> > > > >> the address
>> > > > >> ?? ?? in the guest.
>> > > > >>
>> > > > >> diff --git a/hw/pass-through.c b/hw/pass-through.c
>> > > > >> index dbe8804..7ee3c61 100644
>> > > > >> --- a/hw/pass-through.c
>> > > > >> +++ b/hw/pass-through.c
>> > > > >> @@ -238,6 +238,14 @@ static int pt_bar_reg_restore(struct pt_dev 
>> > > > >> *ptdev,
>> > > > >> ??static int pt_exp_rom_bar_reg_restore(struct pt_dev *ptdev,
>> > > > >> ?? ?? ??struct pt_reg_tbl *cfg_entry,
>> > > > >> ?? ?? ??uint32_t real_offset, uint32_t dev_value, uint32_t *value);
>> > > > >> +static int pt_intel_opregion_read(struct pt_dev *ptdev,
>> > > > >> + ?? ?? ?? ??struct pt_reg_tbl *cfg_entry,
>> > > > >> + ?? ?? ?? ??uint32_t *value, uint32_t valid_mask);
>> > > > >> +static int pt_intel_opregion_write(struct pt_dev *ptdev,
>> > > > >> + ?? ?? ?? ??struct pt_reg_tbl *cfg_entry,
>> > > > >> + ?? ?? ?? ??uint32_t *value, uint32_t dev_value, uint32_t 
>> > > > >> valid_mask);
>> > > > >> +static uint8_t pt_reg_grp_header0_size_init(struct pt_dev *ptdev,
>> > > > >> + ?? ?? ?? ??struct pt_reg_grp_info_tbl *grp_reg, uint32_t 
>> > > > >> base_offset);
>> > > > >>
>> > > > >> ??/* pt_reg_info_tbl declaration
>> > > > >> ?? * - only for emulated register (either a part or whole bit).
>> > > > >> @@ -444,6 +452,16 @@ static struct pt_reg_info_tbl 
>> > > > >> pt_emu_reg_header0_tbl[] = {
>> > > > >> ?? ?? ?? ?? ??.u.dw.write = pt_exp_rom_bar_reg_write,
>> > > > >> ?? ?? ?? ?? ??.u.dw.restore = pt_exp_rom_bar_reg_restore,
>> > > > >> ?? ?? ??},
>> > > > >> + ?? ??/* Intel IGFX OpRegion reg */
>> > > > >> + ?? ??{
>> > > > >> + ?? ?? ?? ??.offset ?? ?? = PCI_INTEL_OPREGION,
>> > > > >> + ?? ?? ?? ??.size ?? ?? ?? = 4,
>> > > > >> + ?? ?? ?? ??.init_val ?? = 0,
>> > > > >> + ?? ?? ?? ??.no_wb ?? ?? ??= 1,
>> > > > >> + ?? ?? ?? ??.u.dw.read ?? = pt_intel_opregion_read,
>> > > > >> + ?? ?? ?? ??.u.dw.write ??= pt_intel_opregion_write,
>> > > > >> + ?? ?? ?? ??.u.dw.restore ??= NULL,
>> > > > >> + ?? ??},
>> > > > >> ?? ?? ??{
>> > > > >> ?? ?? ?? ?? ??.size = 0,
>> > > > >> ?? ?? ??},
>> > > > >> @@ -737,7 +755,7 @@ static const struct pt_reg_grp_info_tbl 
>> > > > >> pt_emu_reg_grp_tbl[] = {
>> > > > >> ?? ?? ?? ?? ??.grp_id ?? ?? = 0xFF,
>> > > > >> ?? ?? ?? ?? ??.grp_type ?? = GRP_TYPE_EMU,
>> > > > >> ?? ?? ?? ?? ??.grp_size ?? = 0x40,
>> > > > >> - ?? ?? ?? ??.size_init ??= pt_reg_grp_size_init,
>> > > > >> + ?? ?? ?? ??.size_init ??= pt_reg_grp_header0_size_init,
>> > > > >> ?? ?? ?? ?? ??.emu_reg_tbl= pt_emu_reg_header0_tbl,
>> > > > >> ?? ?? ??},
>> > > > >> ?? ?? ??/* PCI PowerManagement Capability reg group */
>> > > > >> @@ -3006,6 +3024,19 @@ static uint32_t pt_msixctrl_reg_init(struct 
>> > > > >> pt_dev *ptdev,
>> > > > >> ?? ?? ??return reg->init_val;
>> > > > >> ??}
>> > > > >>
>> > > > >> +static uint8_t pt_reg_grp_header0_size_init(struct pt_dev *ptdev,
>> > > > >> + ?? ?? ?? ??struct pt_reg_grp_info_tbl *grp_reg, uint32_t 
>> > > > >> base_offset)
>> > > > >> +{
>> > > > >> + ?? ??/*
>> > > > >> + ?? ??** By default we will trap up to 0x40 in the cfg space.
>> > > > >> + ?? ??** If an intel device is pass through we need to trap 0xfc,
>> > > > >> + ?? ??** therefore the size should be 0xff.
>> > > > >> + ?? ??*/
>> > > > >> + ?? ??if (igd_passthru)
>> > > > >> + ?? ?? ?? ??return 0xFF;
>> > > > >> + ?? ??return grp_reg->grp_size;
>> > > > >> +}
>> > > > >
>> > > > > Apart from the trivial code style error in the comment above, is this
>> > > > > going to have the unintended side effect of initializing as 0 all the
>> > > > > emulated registers between 0x40 and 0xff, that previously were 
>> > > > > probably
>> > > > > passed through?
>> > > > >
>> > > >
>> > > > Based on how pt_find_reg_grp is implemented that doesn't make any 
>> > > > difference.
>> > >
>> > > actually there is a small change in behaviour: before your patch
>> > > pt_find_reg_grp would return NULL for any cfg register between 0x40 and
>> > > 0xff. Now if igd_passthru is set pt_find_reg_grp would return the
>> > > reg_grp_entry corresponding to "Header Type0 reg group" and then
>> > > pt_find_reg would return NULL.
>> > > This case seems to be handled correctly and bring to the same results
>> > > in both pt_pci_write_config and pt_pci_read_config.
>> > >
>> > > In any case PCI_INTEL_OPREGION should be part of "Header Type0 reg
>> > > group" only it if really is part of this group otherwise it should be in
>> > > its own separate group.
>> >
>> > The pci pass through groups have been designed to pass through pci 
>> > capabilities
>> > from the device. You can't really create a group for something which isn't 
>> > a pci
>> > capability.
>> >
>> > I have noted the change of behavior but that doesn't have any impact on 
>> > what we
>> > will return to the guest so I think it fine.
>>
>> in that case, ack
>
> Ian,
>
> Could you apply this patch please?
>
> Jean

On 31 January 2012 14:51, Jean Guyader <jean.guyader@xxxxxxxxxxxxx> wrote:
>
> On 17/01 04:34, Stefano Stabellini wrote:
> > On Tue, 17 Jan 2012, Jean Guyader wrote:
> > > On 17/01 02:51, Stefano Stabellini wrote:
> > > > On Mon, 16 Jan 2012, Jean Guyader wrote:
> > > > > On 12 January 2012 14:34, Stefano Stabellini
> > > > > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > > > > > On Thu, 12 Jan 2012, Jean Guyader wrote:
> > > > > >> On 12/01 12:41, Jean Guyader wrote:
> > > > > >> > The OpRegion shouldn't be mapped 1:1 because the address in the 
> > > > > >> > host
> > > > > >> > can't be used in the guest directly.
> > > > > >> >
> > > > > >> > This patch traps read and write access to the opregion of the 
> > > > > >> > Intel
> > > > > >> > GPU config space (offset 0xfc).
> > > > > >> >
> > > > > >> > To work correctly this patch needs a change in hvmloader.
> > > > > >> >
> > > > > >> > HVMloader will allocate 2 pages for the OpRegion and write this 
> > > > > >> > address
> > > > > >> > on the config space of the Intel GPU. Qemu will trap and map the 
> > > > > >> > host
> > > > > >> > OpRegion to the guest. Any write to this offset after that won't 
> > > > > >> > have
> > > > > >> > any effect. Any read of this config space offset will return the 
> > > > > >> > address
> > > > > >> > in the guest.
> > > > > >> >
> > > > > >> > Signed-off-by: Jean Guyader <jean.guyader@xxxxxxxxxxxxx>
> > > > > >> commit a6036bee23bb338e6cf48e9f0d75ff0845f8cfe3
> > > > > >> Author: Jean Guyader <jean.guyader@xxxxxxxxxxxxx>
> > > > > >> Date: ?? Wed Nov 23 07:53:30 2011 +0000
> > > > > >>
> > > > > >> ?? ?? qemu-xen: Intel GPU passthrough, fix OpRegion mapping.
> > > > > >>
> > > > > >> ?? ?? The OpRegion shouldn't be mapped 1:1 because the address in 
> > > > > >> the host
> > > > > >> ?? ?? can't be used in the guest directly.
> > > > > >>
> > > > > >> ?? ?? This patch traps read and write access to the opregion of 
> > > > > >> the Intel
> > > > > >> ?? ?? GPU config space (offset 0xfc).
> > > > > >>
> > > > > >> ?? ?? To work correctly this patch needs a change in hvmloader.
> > > > > >>
> > > > > >> ?? ?? HVMloader will allocate 2 pages for the OpRegion and write 
> > > > > >> this address
> > > > > >> ?? ?? on the config space of the Intel GPU. Qemu will trap and map 
> > > > > >> the host
> > > > > >> ?? ?? OpRegion to the guest. Any write to this offset after that 
> > > > > >> won't have
> > > > > >> ?? ?? any effect. Any read of this config space offset will return 
> > > > > >> the address
> > > > > >> ?? ?? in the guest.
> > > > > >>
> > > > > >> diff --git a/hw/pass-through.c b/hw/pass-through.c
> > > > > >> index dbe8804..7ee3c61 100644
> > > > > >> --- a/hw/pass-through.c
> > > > > >> +++ b/hw/pass-through.c
> > > > > >> @@ -238,6 +238,14 @@ static int pt_bar_reg_restore(struct pt_dev 
> > > > > >> *ptdev,
> > > > > >> ??static int pt_exp_rom_bar_reg_restore(struct pt_dev *ptdev,
> > > > > >> ?? ?? ??struct pt_reg_tbl *cfg_entry,
> > > > > >> ?? ?? ??uint32_t real_offset, uint32_t dev_value, uint32_t *value);
> > > > > >> +static int pt_intel_opregion_read(struct pt_dev *ptdev,
> > > > > >> + ?? ?? ?? ??struct pt_reg_tbl *cfg_entry,
> > > > > >> + ?? ?? ?? ??uint32_t *value, uint32_t valid_mask);
> > > > > >> +static int pt_intel_opregion_write(struct pt_dev *ptdev,
> > > > > >> + ?? ?? ?? ??struct pt_reg_tbl *cfg_entry,
> > > > > >> + ?? ?? ?? ??uint32_t *value, uint32_t dev_value, uint32_t 
> > > > > >> valid_mask);
> > > > > >> +static uint8_t pt_reg_grp_header0_size_init(struct pt_dev *ptdev,
> > > > > >> + ?? ?? ?? ??struct pt_reg_grp_info_tbl *grp_reg, uint32_t 
> > > > > >> base_offset);
> > > > > >>
> > > > > >> ??/* pt_reg_info_tbl declaration
> > > > > >> ?? * - only for emulated register (either a part or whole bit).
> > > > > >> @@ -444,6 +452,16 @@ static struct pt_reg_info_tbl 
> > > > > >> pt_emu_reg_header0_tbl[] = {
> > > > > >> ?? ?? ?? ?? ??.u.dw.write = pt_exp_rom_bar_reg_write,
> > > > > >> ?? ?? ?? ?? ??.u.dw.restore = pt_exp_rom_bar_reg_restore,
> > > > > >> ?? ?? ??},
> > > > > >> + ?? ??/* Intel IGFX OpRegion reg */
> > > > > >> + ?? ??{
> > > > > >> + ?? ?? ?? ??.offset ?? ?? = PCI_INTEL_OPREGION,
> > > > > >> + ?? ?? ?? ??.size ?? ?? ?? = 4,
> > > > > >> + ?? ?? ?? ??.init_val ?? = 0,
> > > > > >> + ?? ?? ?? ??.no_wb ?? ?? ??= 1,
> > > > > >> + ?? ?? ?? ??.u.dw.read ?? = pt_intel_opregion_read,
> > > > > >> + ?? ?? ?? ??.u.dw.write ??= pt_intel_opregion_write,
> > > > > >> + ?? ?? ?? ??.u.dw.restore ??= NULL,
> > > > > >> + ?? ??},
> > > > > >> ?? ?? ??{
> > > > > >> ?? ?? ?? ?? ??.size = 0,
> > > > > >> ?? ?? ??},
> > > > > >> @@ -737,7 +755,7 @@ static const struct pt_reg_grp_info_tbl 
> > > > > >> pt_emu_reg_grp_tbl[] = {
> > > > > >> ?? ?? ?? ?? ??.grp_id ?? ?? = 0xFF,
> > > > > >> ?? ?? ?? ?? ??.grp_type ?? = GRP_TYPE_EMU,
> > > > > >> ?? ?? ?? ?? ??.grp_size ?? = 0x40,
> > > > > >> - ?? ?? ?? ??.size_init ??= pt_reg_grp_size_init,
> > > > > >> + ?? ?? ?? ??.size_init ??= pt_reg_grp_header0_size_init,
> > > > > >> ?? ?? ?? ?? ??.emu_reg_tbl= pt_emu_reg_header0_tbl,
> > > > > >> ?? ?? ??},
> > > > > >> ?? ?? ??/* PCI PowerManagement Capability reg group */
> > > > > >> @@ -3006,6 +3024,19 @@ static uint32_t pt_msixctrl_reg_init(struct 
> > > > > >> pt_dev *ptdev,
> > > > > >> ?? ?? ??return reg->init_val;
> > > > > >> ??}
> > > > > >>
> > > > > >> +static uint8_t pt_reg_grp_header0_size_init(struct pt_dev *ptdev,
> > > > > >> + ?? ?? ?? ??struct pt_reg_grp_info_tbl *grp_reg, uint32_t 
> > > > > >> base_offset)
> > > > > >> +{
> > > > > >> + ?? ??/*
> > > > > >> + ?? ??** By default we will trap up to 0x40 in the cfg space.
> > > > > >> + ?? ??** If an intel device is pass through we need to trap 0xfc,
> > > > > >> + ?? ??** therefore the size should be 0xff.
> > > > > >> + ?? ??*/
> > > > > >> + ?? ??if (igd_passthru)
> > > > > >> + ?? ?? ?? ??return 0xFF;
> > > > > >> + ?? ??return grp_reg->grp_size;
> > > > > >> +}
> > > > > >
> > > > > > Apart from the trivial code style error in the comment above, is 
> > > > > > this
> > > > > > going to have the unintended side effect of initializing as 0 all 
> > > > > > the
> > > > > > emulated registers between 0x40 and 0xff, that previously were 
> > > > > > probably
> > > > > > passed through?
> > > > > >
> > > > >
> > > > > Based on how pt_find_reg_grp is implemented that doesn't make any 
> > > > > difference.
> > > >
> > > > actually there is a small change in behaviour: before your patch
> > > > pt_find_reg_grp would return NULL for any cfg register between 0x40 and
> > > > 0xff. Now if igd_passthru is set pt_find_reg_grp would return the
> > > > reg_grp_entry corresponding to "Header Type0 reg group" and then
> > > > pt_find_reg would return NULL.
> > > > This case seems to be handled correctly and bring to the same results
> > > > in both pt_pci_write_config and pt_pci_read_config.
> > > >
> > > > In any case PCI_INTEL_OPREGION should be part of "Header Type0 reg
> > > > group" only it if really is part of this group otherwise it should be in
> > > > its own separate group.
> > >
> > > The pci pass through groups have been designed to pass through pci 
> > > capabilities
> > > from the device. You can't really create a group for something which 
> > > isn't a pci
> > > capability.
> > >
> > > I have noted the change of behavior but that doesn't have any impact on 
> > > what we
> > > will return to the guest so I think it fine.
> >
> > in that case, ack
>
> Ian,
>
> Could you apply this patch please?
>

Hi Ian,

I was going through my email and it seems that this patch didn't make
it into qemu-xen-unstable.

Signed-off-by: Jean Guyader <jean.guyader@xxxxxxxxx>

Thanks,
Jean

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.