[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.





On 10/23/2017 7:27 PM, Andre Przywara wrote:
Hi Manish,

On 12/10/17 22:03, Manish Jaggi wrote:
ACPI/IORT Support in Xen.
--------------------------------------

I had sent out patch series [0] to hide smmu from Dom0 IORT. Extending
the scope
and including all that is required to support ACPI/IORT in Xen.
Presenting for review
first _draft_ of design of ACPI/IORT support in Xen. Not complete though.

Discussed is the parsing and generation of IORT table for Dom0 and DomUs.
It is proposed that IORT be parsed and the information in saved into xen
data-structure
say host_iort_struct and is reused by all xen subsystems like ITS / SMMU
etc.

Since this is first draft is open to technical comments, modifications
and suggestions. Please be open and feel free to add any missing points
/ additions.

1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Future Work and TODOs

1. What is IORT. What are its components ?
--------------------------------------------
IORT refers to Input Output remapping table. It is essentially used to find
information about the IO topology (PCIRC-SMMU-ITS) and relationships
between
devices.

A general structure of IORT is has nodes which have information about
PCI RC,
SMMU, ITS and Platform devices. Using an IORT table relationship between
RID -> StreamID -> DeviceId can be obtained. More specifically which
device is
behind which SMMU and which interrupt controller, this topology is
described in
IORT Table.

RID is a requester ID in PCI context,
StreamID is the ID of the device in SMMU context,
DeviceID is the ID programmed in ITS.

For a non-pci device RID could be simply an ID.

Each iort_node contains an ID map array to translate from one ID into
another.
IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is present in PCI RC node,SMMU node, Named component node etc
and can reference to a SMMU or ITS node.

2. Current Support of IORT
---------------------------
Currently Xen passes host IORT table to dom0 without any modifications.
For DomU no IORT table is passed.

3. IORT for Dom0
-----------------
IORT for Dom0 is prepared by xen and it is fairly similar to the host iort.
However few nodes could be removed removed or modified. For instance
- host SMMU nodes should not be present
- ITS group nodes are same as host iort but, no stage2 mapping is done
for them.
What do you mean with stage2 mapping?
Please ignore this line. Copy paste error. Read it as follows

- ITS group nodes are same as host iort.
(though I would modify the same as in next draft)


- platform nodes (named components) may be selectively present depending
on the case where xen is using some. This could be controlled by  xen command
line.
Mmh, I am not so sure platform devices described in the IORT (those
which use MSIs!) are so much different from PCI devices here. My
understanding is those platform devices are network adapters, for
instance, for which Xen has no use.
ok.
So I would translate "Named Components" or "platform devices" as devices
just not using the PCIe bus (so no config space and no (S)BDF), but
being otherwise the same from an ITS or SMMU point of view.
Correct.
- More items : TODO
I think we agreed upon rewriting the IORT table instead of patching it?
yes. In fact if you look at my patch v2 on IORT SMMU hiding, it was _rewriting_ most of Dom0 IORT and not patching it.
We can have a IRC discussion on this.

I think apart from rewriting, the other tasks which were required that are handled in this epic task
- parse IORT and save in xen internal data structures
- common code to generate IORT for dom0/domU
- All xen code that parses IORT multiple times use now the xen internal data structures.
(I have explained this in this mail below)
So to some degree your statements are true, but when we rewrite the IORT
table without SMMUs (and possibly without other components like the
PMUs), it would be kind of a stretch to call it "fairly similar to the
host IORT". I think "based on the host IORT" would be more precise.
Yes. Based on host IORT is better,thanks.

4. IORT for DomU
-----------------
IORT for DomU is generated by the toolstack. IORT topology is different
when DomU supports device passthrough.
Can you elaborate on that? Different compared to what? My understanding
is that without device passthrough there would be no IORT in the first
place?
I was exploring the possibility of having virtual devices for DomU.
So if a virtual is assigned to guest there needs to be some mapping in IORT as well.
This virtual device can be on a PCI bus / or as a platform device.

Device Pass-through can be split into two parts
a. platform device passthrough (not on PCI bus)
b. PCI device PT

=> If we discount the possibility of a virtual device for domU and platform device passthrough
 then you are correct no IORT is required.

When PCI device passthrough is supported, the PCIRC is itself virtual (emulated by Xen).
One can have any number of virtual PCIRC  and may be virtual SMMUs.
Hence the topology can vary.

Now read the below lines.
At a minimum domU IORT should include a single PCIRC and ITS Group.
Similar PCIRC can be added in DSDT.
Additional node can be added if platform device is assigned to domU.
No extra node should be required for PCI device pass-through.
Again I don't fully understand this last sentence.
The last line is continuation of the first line "At a minimum..."

It is proposed that the idrange of PCIRC and ITS group be constant for
domUs.
"constant" is a bit confusing here. Maybe "arbitrary", "from scratch" or
"independent from the actual h/w"?
ok. that is implementation defined.

In case if PCI PT,using a domctl toolstack can communicate
physical RID: virtual RID, deviceID: virtual deviceID to xen.

It is assumed that domU PCI Config access would be trapped in Xen. The
RID at which assigned device is enumerated would be the one provided by the
domctl, domctl_set_deviceid_mapping

TODO: device assign domctl i/f.
Note: This should suffice the virtual deviceID support pointed by Andre.
[4]
Well, there's more to it. First thing: while I tried to include virtual
ITS deviceIDs to be different from physical ones, in the moment there
are fixed to being mapped 1:1 in the code.
oh
So the first step would be to go over the ITS code and identify where
"devid" refers to a virtual deviceID and where to a physical one
(probably renaming them accordingly). Then we would need a function to
translate between the two. At the moment this would be a dummy function
(just return the input value). Later we would loop in the actual table.
Some thought here..
Wouldn't it be better to call a helper function to translate the devid coming from guest. The helper function would look at the table created by handling successive domctls (the one mentioned here)

We might not need this domctl if assign_device hypercall is extended to
provide this information.
Do we actually need a new interface or even extend the existing one?
If I got Julien correctly, the existing interface is just fine?
Could you explain which existing interface can be used to translate guest device ID to host device ID when an ITS command gets trapped in Xen.
may be I am missing something here.
5. Parsing of IORT in Xen
--------------------------
IORT nodes can be saved in structures so that IORT table parsing can be
done once and is reused by all xen subsystems like ITS / SMMU etc, domain
creation.
Proposed are the structures to hold IORT information, very similar to ACPI
structures.

iort_id_map {
     range_t input_range;
     range_t output_range;
     void *output_reference;
...
}
I guess you would need a "struct list_head list" here to chain the ranges?
yes :). That was in ...

=>output_reference points to object of iort_node.

struct iort_node {
     struct list_head id_map;
     void *context;
     struct list_head list;
}
=> context could be a reference to acpi_iort_node.

struct iort_table_struct {
     struct list_head pci_rc_nodes;
     struct list_head smmu_nodes;
     struct list_head plat_devices;
     struct list_head its_group;
}
So quickly brainstorming with Julien I was wondering if we could
actually simplify this significantly:
 From Xen's point of view all we need to know is the mapping between PCI
requestor IDs (or some platform device IDs) to the physical ITS device
ID, and from requestor IDs to the SMMU stream ID.
That would be just *two* lookup tables, not connected to each other
aside from possibly having the same input ranges. At this point we could
also have *one* table, containing both the ITS deviceID and the SMMU
stream ID:

struct iort_id_map {
        range_t input_range;
        uint32_t its_devid_base;
        uint32_t smmu_streamid_base;
        struct list_head list;
};
This is just a simpler case of what the spec supports.
Each PCIRC node can have an array of idmaps, each id map entry can have a multiple of idmap entries in
the output reference smmu idmap.

I had a similar discussion with the v1 version of my IORT SMMU hidepatch with julien.

Moreover I dont quite understand the where iort_id_map would fit.
As if you see below reply, we have a other things to take care as well

So parsing the IORT would create and fill a list of those structures.
For a lookup we would just iterate over that list, find a matching entry
and:
return (input_id - match->input_range.base) + match->its_devid_base;

Ideally we abstract this via some functions, so that we can later swap
this for more efficient data structures should the need arise.

This structure is created at the point IORT table is parsed say from
acpi_iort_init.
It is proposed to use this structure information in
iort_init_platform_devices.
[2] [RFC v2 4/7] ACPI: arm: Support for IORT
I guess you missed this part. Without this the  context is missed.

The main purpose of this whole task is split into two parts
a. IORT parsing should be done once in Xen and later whenever IORT parsing is required use xen internal data structures
      (iort_table_struct)

    - this would be helpful for (b) below
    - and for SMMU / platform devices initialization.
      If you see [2], it again parses the IORT.
So the approach here is [2] and (b) should use same Xen internal IORT data structures.

=> For that reason [2]/[5] might need to be rebased on this task's patch. <=
[5] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg123080.html

b. Generate IORT for Doms without patching Host IORT, rather regenerate from xen internal data structures.

based on this rationale, I think the data structures mentioned would be required.


6. IORT Generation
-------------------
There would be a common code to generate IORT table from iort_table_struct.
That sounds useful, but we would need to be careful with sharing code
between Xen and the tool stack. Has this actually been done before?
I added the code sharing part here, but I am not hopeful that this would work as it would require lot of code change on toolstack. A simple difference is that the acpi header structures have different member variables. This is same for other structures. So we might have to create a lot of defines in common code for sharing and possibility of errors.

See: struct acpi_header in acpi2_0.h (tools/libacpi)
and struct acpi_table_header in actbl.h (xen/include/acpi)

That is why I preferred a domctl, so xen coud prepare IORT for DomU.

If not code sharing then code duplication might also work (In that case no domctl required)
We can discuss on this more...

a. For Dom0
     the structure (iort_table_struct) be modified to remove smmu nodes
     and update id_mappings.
     PCIRC idmap -> output refrence to ITS group.
     (RID -> DeviceID).

     TODO: Describe algo in update_id_mapping function to map RID ->
DeviceID used
     in my earlier patch [3]
If the above approach works, this would become a simple list iteration,
creating PCI rc nodes with the appropriate pointer to the ITS nodes.
Yes, it works. see [3]

b. For DomU
     - iort_table_struct would have minimal 2 nodes (1 PCIRC and 1 ITS
group)
     - populate a basic IORT in a buffer passed by toolstack( using a
domctl : domctl_prepare_dom_iort)
I think we should reduce this to iterating the same data structure as
for Dom0. Each pass-through-ed PCI device would possibly create one
struct instance, and later on we do the same iteration as we do for
Dom0. If that proves to be simple enough, we might even live with the
code duplication between Xen and the toolstack.
Yes, thats the Idea. for domu and dom0 the IORT generation code would take

iort_table_struct
as input.


Cheers,
Andre.

     - DSDT for the DomU is updated by toolstack to include a PCIRC.
     - If a named component is added to domU that information is passed
in the
     same/additional domctl.
         - <TODO: domctl_prepare_dom_iort i/f >
     Note: Julien I have tried to incorporate your suggestion for code
reuse.

7. References:
-------------
[0] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg121667.html
[1] ARM DEN0049C:
http://infocenter.arm.com/help/topic/com.arm.doc.den0049c/DEN0049C_IO_Remapping_Table.pdf

[2] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg123082.html
[3] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg121669.html:
update_id_mapping function.
[4] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg123434.html




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.