
Re: Metadata and signalling channels for Zephyr virtio-backends on Xen



Hi Stefano,

On Tue, 8 Feb 2022 at 01:16, Stefano Stabellini
<stefano.stabellini@xxxxxxxxxx> wrote:
>
> On Mon, 7 Feb 2022, Alex Bennée wrote:
> > Hi Stefano,
> >
> > Vincent gave an update on his virtio-scmi work at the last Stratos sync
> > call and the discussion moved onto next steps.
>
> Hi Alex,
>
> I don't know the specifics of virtio-scmi, but if it is about power,
> clocks, reset, etc. like the original SCMI protocol, then virtio-scmi is

virtio-scmi is one transport channel that supports the SCMI protocol.

> likely going to be very different from all the other virtio frontends

The virtio-scmi front-end is already merged in mainline Linux.

> and backends. That's because SCMI requires a full view of the system,
> which is different from something like virtio-net that is limited to the
> emulation of 1 device. For this reason, it is likely that the
> virtio-scmi backend would be a better fit in Xen itself, rather than run
> in userspace inside a VM.

Not sure what you mean when you say that SCMI requires a full view of
the system. If you are referring to the system-wide resources which
reset or power up/down the whole SoC, that is not really what we are
targeting here. Those system-wide resources should already be handled
by a dedicated power coprocessor. In our case the IPs of the SoC will
be handled by different VMs, but those IPs usually share common
resources such as a parent PLL, a power domain or a clock gating
register. Because those VMs can't set these resources directly
without taking the others into account, and because the power
coprocessor doesn't have an unlimited number of channels, we add an
SCMI backend that gathers and proxies the VM requests before
accessing, for example, the register that gates some IP clocks, or
before powering down an external regulator shared between the camera
and another device. This SCMI backend will most probably also send
requests with OSPM permission access to the power coprocessor once it
has aggregated all the VMs' requests.
We are using the virtio-scmi transport because it has the main
advantage of not being tied to a hypervisor.

In our PoC, the SCMI backend runs on Zephyr and reuses the same
software that can run in the power coprocessor. This helps split what
is critical and must be handled by the power coprocessor from what is
not critical for the system (typically what is managed directly by
Linux when no hypervisor is involved).

>
> FYI, a good and promising approach to handle both SCMI and SCPI is the
> series recently submitted by EPAM to mediate SCMI and SCPI requests in
> Xen: https://marc.info/?l=xen-devel&m=163947444032590
>
> (Another "special" virtio backend is virtio-iommu for similar reasons:
> the guest p2m address mappings and also the IOMMU drivers are in Xen.
> It is not immediately clear whether a virtio-iommu backend would need to
> be in Xen or run as a process in dom0/domU.)
>
> On the other hand, for all the other "normal" protocols (e.g.
> virtio-net, virtio-block, etc.) the backend would naturally run as a
> process in dom0 or domU (e.g. QEMU in Dom0) as one would expect.
>
>
> > Currently the demo setup
> > is intermediated by a double-ended vhost-user daemon running on the
> > devbox acting as a go between a number of QEMU instances representing
> > the front and back-ends. You can view the architecture with Vincent's
> > diagram here:
> >
> >   
> > https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing
> >
> > The key virtq handling is done over the special carve outs of shared
> > memory between the front end and guest. However the signalling is
> > currently over a virtio device on the backend. This is useful for the
> > PoC but obviously in a real system we don't have a hidden POSIX system
> > acting as a go-between, not to mention the additional latency it causes
> > with all those context switches.
> >
> > I was hoping we could get some more of the Xen experts to the next
> > Stratos sync (17th Feb) to go over approaches for a properly hosted on
> > Xen approach. From my recollection (Vincent please correct me if I'm
> > wrong) of last week the issues that need solving are:
>
> Unfortunately I have a regular conflict which prevents me from being
> able to join the Stratos calls. However, I can certainly make myself
> available for one call (unless something unexpected comes up).
>
>
> >  * How to handle configuration steps as FE guests come up
> >
> > The SCMI server will be a long running persistent backend because it is
> > managing real HW resources. However the guests may be ephemeral (or just
> > restarted) so we can't just hard-code everything in a DTB. While the
> > virtio-negotiation in the config space covers most things we still need
> > information like where in the guests address space the shared memory
> > lives and at what offset into that the queues are created. As far as I'm
> > aware the canonical source of domain information is XenStore
> > (https://wiki.xenproject.org/wiki/XenStore) but this relies on a Dom0
> > type approach. Is there an alternative for dom0less systems or do we
> > need a dom0-light approach, for example using STR-21 (Ensure Zephyr can
> > run cleanly as a Dom0 guest) providing just enough services for FE's to
> > register metadata and BE's to read it?
>
> I'll try to answer the question for a generic virtio frontend and
> backend instead (not SCMI because SCMI is unique due to the reasons
> above.)
>
> Yes, xenstore is the easiest way to exchange configuration information
> between domains. I think EPAM used xenstore to exchange the
> configuration information in their virtio-block demo. There is a way to
> use xenstore even between dom0less VMs:
> https://marc.info/?l=xen-devel&m=164340547602391 Not just xenstore but
> full PV drivers too. However, in the dom0less case xenstore is going to
> become available some time after boot, not immediately at startup time.
> That's because you need to wait until xenstored is up and running.
>
> There are other ways to send data from one VM to another which are
> available immediately at boot, such as Argo and static shared memory.
>
> But dom0less is all about static partitioning, so it makes sense to
> exploit the build-time tools to the fullest. In the dom0less case, we
> already know what is going to run on the target before it is even turned
> on. As an example, we might have already prepared an environment with 3
> VMs using Yocto and ImageBuilder. We could also generate all
> configurations needed and place them inside each VM using Yocto's
> standard tools and ImageBuilder. So for dom0less, I recommend going via
> a different route and pre-generate the configuration directly where
> needed instead of doing dynamic discovery.
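
As an illustration of the xenstore route, the backend side could read
a frontend-provided value roughly like this with libxenstore (the
xenstore path below is only an example, error handling trimmed):

  /* Minimal sketch: backend reads configuration written by/for the
   * frontend into xenstore; the path is made up for the example. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <xenstore.h>

  int main(void)
  {
      struct xs_handle *xs = xs_open(0);
      unsigned int len;
      char *val;

      if (!xs)
          return 1;

      /* e.g. where the frontend advertised its shared memory region */
      val = xs_read(xs, XBT_NULL,
                    "/local/domain/1/data/virtio/shm-base", &len);
      if (val) {
          printf("frontend shm base: %s\n", val);
          free(val);
      }

      xs_close(xs);
      return 0;
  }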
>
>
> >  * How to handle mapping of memory
> >
> > AIUI the Xen model is the FE guest explicitly makes grant table requests
> > to expose portions of its memory to other domains. Can the BE query the
> > hypervisor itself to discover the available grants or does it require
> > coordination with Dom0/XenStore for that information to be available to
> > the BE domain?
>
> Typically the frontend passes grant table references to the backend
> (i.e. instead of plain guest physical addresses on the virtio ring.)
> Then, the backend maps the grants; Xen checks that the mapping is
> allowed.
>
> We might be able to use the same model with virtio devices. A special
> pseudo-IOMMU driver in Linux would return a grant table reference and an
> offset as "DMA address". The "DMA address" is passed to the virtio
> backend over the virtio ring. The backend would map the grant table
> reference using the regular grant table hypercalls.
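
For illustration, the backend-side mapping with libxengnttab could
look roughly like the sketch below (how the frontend domid and the
grant reference are extracted from the descriptor is left out):

  /* Sketch: map a grant reference received as a "DMA address" on the
   * virtio ring, using the regular grant table interface. */
  #include <stdint.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <xengnttab.h>

  void *map_frontend_buffer(uint32_t fe_domid, uint32_t gref)
  {
      xengnttab_handle *xgt = xengnttab_open(NULL, 0);
      void *addr;

      if (!xgt)
          return NULL;

      /* Xen checks that fe_domid really granted this page to us */
      addr = xengnttab_map_grant_ref(xgt, fe_domid, gref,
                                     PROT_READ | PROT_WRITE);
      if (!addr)
          fprintf(stderr, "grant map failed\n");

      /* xengnttab_unmap(xgt, addr, 1) once the request is handled */
      return addr;
  }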
>
>
> >  * How to handle signalling
> >
> > I guess this requires a minimal implementation of the IOREQ calls for
> > Zephyr so we can register the handler in the backend? Does the IOREQ API
> > allow for IPI-style notifications using the global GIC IRQs?
> >
> > Forgive the incomplete notes from the Stratos sync, I was trying to type
> > while participating in the discussion so hopefully this email captures
> > what was missed:
> >
> >   
> > https://linaro.atlassian.net/wiki/spaces/STR/pages/28682518685/2022-02-03+Project+Stratos+Sync+Meeting+Notes
>
> Yes, any emulation backend (including virtio backends) would require an
> IOREQ implementation, which includes notifications via event channels.
> Event channels are delivered as a GIC PPI interrupt to the Linux kernel.
> Then, the kernel sends the notification to userspace via a file
> descriptor.
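
For a Linux userspace backend, the notification side could look
roughly like the loop below with libxenevtchn (a Zephyr backend would
need equivalent bindings; this is only an illustration, the port
numbers would come from the configuration step discussed above):

  /* Sketch: wait for frontend kicks on an event channel and notify
   * back once the virtqueue has been processed. */
  #include <stdint.h>
  #include <xenevtchn.h>

  void backend_notification_loop(uint32_t fe_domid, uint32_t remote_port)
  {
      xenevtchn_handle *xce = xenevtchn_open(NULL, 0);
      xenevtchn_port_or_error_t local, port;

      if (!xce)
          return;

      local = xenevtchn_bind_interdomain(xce, fe_domid, remote_port);

      for (;;) {
          port = xenevtchn_pending(xce);     /* blocks on the evtchn fd */

          if (port == local) {
              /* process the virtqueue here, then re-arm and kick back */
              xenevtchn_unmask(xce, port);
              xenevtchn_notify(xce, local);
          }
      }
  }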



 

