Yuji Shimada <mailto:shimada-yxb@xxxxxxxxxxxxxxx> wrote:
> On Mon, 22 Sep 2008 21:47:11 +0800
> "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>
>> Yuji Shimada, what do you think about restricting AER to the host side,
>> i.e. when an uncorrectable error happens, we kill the guest and FLR the
>> device so that the whole system is not destroyed, at least as a
>> first step?
>
> I think restricting AER to the host side is a good idea as a first
> step, because the implementation will be simple.
>
> By the way, can we recover from the error condition with FLR alone?
> Resetting the link from the root port is needed for some errors, isn't it?
Yes, a root port link reset is needed on the host side. I meant that FLR is
only for the guest-specific part.
What I'm considering is adding error handling to pciback, so that when the host
resets the hierarchy, pciback's error handler is invoked and notifies the
control panel. But I'm still not sure whether any mechanism exists for that
notification (otherwise, we need a Xen-specific mechanism). I'm also not sure
whether the long latency is acceptable for error handling, especially since it
may finish only after the link reset.
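To make the host-side-only idea a bit more concrete, here is a rough sketch of the policy as a standalone C function. It is not pciback code: every name in it (err_severity, host_action, assigned_dev, host_aer_policy) is hypothetical, and it assumes that correctable errors need no action and that only fatal errors need the link reset from the root port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical severity levels; not taken from any real header. */
enum err_severity { ERR_CORRECTABLE, ERR_NONFATAL, ERR_FATAL };

/* Hypothetical actions the host could take for an assigned device. */
enum host_action {
    ACT_NONE       = 0,
    ACT_NOTIFY     = 1 << 0, /* tell the control panel */
    ACT_KILL_GUEST = 1 << 1, /* destroy the owning guest */
    ACT_FLR        = 1 << 2, /* function level reset of the device */
    ACT_LINK_RESET = 1 << 3  /* host resets the link from the root port */
};

/* Hypothetical record for a device passed through to a guest. */
struct assigned_dev {
    int  domid;   /* owning guest */
    bool has_flr; /* device supports FLR */
};

/*
 * Decide what the host does when AER reports an error on an assigned
 * device. The guest never sees the error; it is simply killed.
 * Returns a bitmask of host_action values.
 */
unsigned int host_aer_policy(const struct assigned_dev *dev,
                             enum err_severity sev)
{
    unsigned int act;

    assert(dev != NULL);

    /* Assumption: correctable errors are already handled by hardware. */
    if (sev == ERR_CORRECTABLE)
        return ACT_NONE;

    act = ACT_NOTIFY | ACT_KILL_GUEST;
    if (dev->has_flr)
        act |= ACT_FLR;

    /* Assumption: only fatal errors need the link reset. */
    if (sev == ERR_FATAL)
        act |= ACT_LINK_RESET;

    return act;
}
```

The open questions above are not answered by this sketch: it says nothing about how ACT_NOTIFY would actually reach the control panel, or about how long that takes relative to the link reset.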
>
>> As for PCI-E features on the guest side, I think it may be complex,
>> especially since I'm not sure how to handle PCI-E features that
>> require operations on the whole hierarchy. For example, VC/TC
>> may require changes to the switch and root port settings. AER
>> handling in Linux may require resetting the link from the root port. IMO,
>> it is complex to handle such features.
>
> I agree with you that implementing the full PCI-E feature set on the guest
> side will be complex. I don't think VC/TC on the guest side is needed. But AER
I remember seeing a document saying that Windows has VC/TC support for HD Audio,
although I'm not sure how it is implemented. Is VC/TC needed for communication
usage?
> on the guest side is required in the long term, because the guest OS will
> then be able to handle AER and recover from the error condition.
Yes, I agree that if the guest can do AER, it will enhance reliability and
availability. But a more elegant design is needed. For example, if the guest
decides that AER recovery needs a root port link reset (a switch link reset
should be OK unless SR-IOV is involved), what should the host do? If the host
acts on the guest's suggestion, that may not be safe, I suspect.
BTW, do you know what the recovery action usually is? I didn't find much
documentation on it, and the PCI-E spec didn't give much of a clue either.
Thanks
Yunhong Jiang
>
> Thanks,
>
> --
> Yuji Shimada
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel